commit
b63a636512
|
@ -0,0 +1,20 @@
|
|||
---
|
||||
name: Bug report
|
||||
about: Create a report to help us improve
|
||||
title: ''
|
||||
labels: bug
|
||||
assignees: ''
|
||||
|
||||
---
|
||||
|
||||
**Describe the bug**
|
||||
A clear and concise description of what the bug is.
|
||||
|
||||
**To Reproduce**
|
||||
|
||||
1. If any, include the code that you are using
|
||||
2. If any, include the command line statements that you are using
|
||||
3. If you have problems with a specific pdf file, include that pdf file
|
||||
|
||||
**Expected behavior**
|
||||
A clear and concise description of what you expected to happen.
|
|
@ -0,0 +1,17 @@
|
|||
---
|
||||
name: Feature request
|
||||
about: Suggest an improvement for this project
|
||||
title: ''
|
||||
labels: enhancement
|
||||
assignees: ''
|
||||
|
||||
---
|
||||
|
||||
**Is your feature request related to a problem? Please describe.**
|
||||
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
|
||||
|
||||
**Describe the solution you'd like**
|
||||
A clear and concise description of what you want to happen.
|
||||
|
||||
**Describe alternatives you've considered**
|
||||
A clear and concise description of any alternative solutions or features you've considered.
|
|
@ -0,0 +1,17 @@
|
|||
**Description**
|
||||
|
||||
Please include a summary of the change and which issue is fixed. If this does not fix an issue, then first create a new issue. Please also include relevant motivation and context.
|
||||
|
||||
Fixes # (issue)
|
||||
|
||||
**How Has This Been Tested?**
|
||||
|
||||
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Include an example pdf if you have one.
|
||||
|
||||
**Checklist**
|
||||
|
||||
- [ ] I have added tests that prove my fix is effective or that my feature works
|
||||
- [ ] I have updated the [README.md](../README.md) and other documentation, or I am sure that this is not necessary
|
||||
- [ ] I have added a concise human-readable description of the change to [CHANGELOG.md](../CHANGELOG.md)
|
||||
- [ ] I have added docstrings to newly created methods and classes
|
||||
- [ ] I have optimized the code at least one time after creating the initial version
|
|
@ -4,7 +4,9 @@ python:
|
|||
- "3.4"
|
||||
- "3.5"
|
||||
- "3.6"
|
||||
- "3.7"
|
||||
- "3.8"
|
||||
install:
|
||||
- pip install tox-travis
|
||||
script:
|
||||
- tox
|
||||
- tox -r
|
||||
|
|
29
CHANGELOG.md
|
@ -7,6 +7,33 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
|
|||
|
||||
Nothing yet
|
||||
|
||||
## [20191107] - 2019-11-07
|
||||
|
||||
### Deprecated
|
||||
- The argument `_py2_no_more_posargs`, because Python 2 support will be removed in
|
||||
January 2020 ([#328](https://github.com/pdfminer/pdfminer.six/pull/328) and
|
||||
[#307](https://github.com/pdfminer/pdfminer.six/pull/307))
|
||||
|
||||
### Added
|
||||
- Simple wrapper to easily extract text from a PDF file [#330](https://github.com/pdfminer/pdfminer.six/pull/330)
|
||||
- Support for extracting JBIG2 encoded images ([#311](https://github.com/pdfminer/pdfminer.six/pull/311) and [#46](https://github.com/pdfminer/pdfminer.six/pull/46))
|
||||
- Sphinx documentation that is published on
|
||||
[Read the Docs](https://pdfminersix.readthedocs.io/)
|
||||
([#329](https://github.com/pdfminer/pdfminer.six/pull/329))
|
||||
|
||||
### Fixed
|
||||
- Unhandled AssertionError when dumping pdf containing reference to object id 0
|
||||
([#318](https://github.com/pdfminer/pdfminer.six/pull/318))
|
||||
- Debug flag actually changes logging level to debug for pdf2txt.py and
|
||||
dumppdf.py ([#325](https://github.com/pdfminer/pdfminer.six/pull/325))
|
||||
|
||||
### Changed
|
||||
- Using argparse instead of getopt for command line interface of dumppdf.py ([#321](https://github.com/pdfminer/pdfminer.six/pull/321))
|
||||
- Refactor `LTLayoutContainer.group_textboxes` for a significant speed up in layout analysis ([#315](https://github.com/pdfminer/pdfminer.six/pull/315))
|
||||
|
||||
### Removed
|
||||
- Files for external applications such as django, cgi and pyinstaller ([#314](https://github.com/pdfminer/pdfminer.six/issues/314))
|
||||
|
||||
## [20191020] - 2019-10-20
|
||||
|
||||
### Deprecated
|
||||
|
@ -27,7 +54,7 @@ Nothing yet
|
|||
- Allow for bounding boxes with zero height or width by removing assertion ([#246](https://github.com/pdfminer/pdfminer.six/pull/246))
|
||||
|
||||
### Changed
|
||||
- All dependencies are managed in `setup.py` ([#306](https://github.com/pdfminer/pdfminer.six/pull/306), [#219](https://github.com/pdfminer/pdfminer.six/pull/219))
|
||||
- All dependencies are managed in `setup.py` ([#306](https://github.com/pdfminer/pdfminer.six/pull/306) and [#219](https://github.com/pdfminer/pdfminer.six/pull/219))
|
||||
|
||||
## [20181108] - 2018-11-08
|
||||
|
||||
|
|
69
README.md
|
@ -1,21 +1,22 @@
|
|||
PDFMiner.six
|
||||
pdfminer.six
|
||||
============
|
||||
|
||||
PDFMiner.six is a fork of PDFMiner using six for Python 2+3 compatibility
|
||||
[![Build Status](https://travis-ci.org/pdfminer/pdfminer.six.svg?branch=master)](https://travis-ci.org/pdfminer/pdfminer.six)
|
||||
[![PyPI version](https://img.shields.io/pypi/v/pdfminer.six.svg)](https://pypi.python.org/pypi/pdfminer.six/)
|
||||
[![gitter](https://badges.gitter.im/pdfminer-six/Lobby.svg)](https://gitter.im/pdfminer-six/Lobby?utm_source=badge&utm_medium)
|
||||
|
||||
[![Build Status](https://travis-ci.org/pdfminer/pdfminer.six.svg?branch=master)](https://travis-ci.org/pdfminer/pdfminer.six) [![PyPI version](https://img.shields.io/pypi/v/pdfminer.six.svg)](https://pypi.python.org/pypi/pdfminer.six/)
|
||||
|
||||
PDFMiner is a tool for extracting information from PDF documents.
|
||||
Pdfminer.six is a community-maintained fork of the original PDFMiner. It is a
|
||||
tool for extracting information from PDF documents.
|
||||
Unlike other PDF-related tools, it focuses entirely on getting
|
||||
and analyzing text data. PDFMiner allows one to obtain
|
||||
and analyzing text data. Pdfminer.six allows one to obtain
|
||||
the exact location of text in a page, as well as
|
||||
other information such as fonts or lines.
|
||||
It includes a PDF converter that can transform PDF files
|
||||
into other text formats (such as HTML). It has an extensible
|
||||
PDF parser that can be used for other purposes than text analysis.
|
||||
|
||||
* Webpage: https://github.com/pdfminer/
|
||||
* Download (PyPI): https://pypi.python.org/pypi/pdfminer.six/
|
||||
Check out the full documentation on
|
||||
[Read the Docs](https://pdfminersix.readthedocs.io).
|
||||
|
||||
|
||||
Features
|
||||
|
@ -23,62 +24,30 @@ Features
|
|||
|
||||
* Written entirely in Python.
|
||||
* Parse, analyze, and convert PDF documents.
|
||||
* PDF-1.7 specification support. (well, almost)
|
||||
* PDF-1.7 specification support. (well, almost).
|
||||
* CJK languages and vertical writing scripts support.
|
||||
* Various font types (Type1, TrueType, Type3, and CID) support.
|
||||
* Support for extracting images (JPG, JBIG2 and Bitmaps).
|
||||
* Basic encryption (RC4) support.
|
||||
* Outline (TOC) extraction.
|
||||
* Tagged contents extraction.
|
||||
* Automatic layout analysis.
|
||||
|
||||
|
||||
How to Install
|
||||
--------------
|
||||
How to use
|
||||
----------
|
||||
|
||||
* Install Python 2.7 or newer.
|
||||
* Install
|
||||
* Install Python 2.7 or newer. Note that Python 2 support will be dropped in
|
||||
January 2020.
|
||||
|
||||
`pip install pdfminer.six`
|
||||
|
||||
* Run the following test:
|
||||
* Use command-line interface to extract text from pdf:
|
||||
|
||||
`pdf2txt.py samples/simple1.pdf`
|
||||
`python pdf2txt.py samples/simple1.pdf`
|
||||
|
||||
|
||||
Command Line Tools
|
||||
------------------
|
||||
|
||||
PDFMiner comes with two handy tools:
|
||||
pdf2txt.py and dumppdf.py.
|
||||
|
||||
**pdf2txt.py**
|
||||
|
||||
pdf2txt.py extracts text contents from a PDF file.
|
||||
It extracts all the text that is to be rendered programmatically,
|
||||
i.e. text represented as ASCII or Unicode strings.
|
||||
It cannot recognize text drawn as images that would require optical character recognition.
|
||||
It also extracts the corresponding locations, font names, font sizes, writing
|
||||
direction (horizontal or vertical) for each text portion.
|
||||
You need to provide a password for protected PDF documents whose access is restricted.
|
||||
You cannot extract any text from a PDF document which does not have extraction permission.
|
||||
|
||||
(For details, refer to /docs/index.html.)
|
||||
|
||||
**dumppdf.py**
|
||||
|
||||
dumppdf.py dumps the internal contents of a PDF file in pseudo-XML format.
|
||||
This program is primarily for debugging purposes,
|
||||
but it's also possible to extract some meaningful contents (e.g. images).
|
||||
|
||||
(For details, refer to /docs/index.html.)
|
||||
|
||||
|
||||
TODO
|
||||
----
|
||||
|
||||
* PEP-8 and PEP-257 conformance.
|
||||
* Better documentation.
|
||||
* Performance improvements.
|
||||
* Check out more examples and documentation on
|
||||
[Read the Docs](https://pdfminersix.readthedocs.io).
|
||||
|
||||
|
||||
Contributing
|
||||
|
|
|
@ -0,0 +1 @@
|
|||
build/
|
|
@ -0,0 +1,20 @@
|
|||
# Minimal makefile for Sphinx documentation
|
||||
#
|
||||
|
||||
# You can set these variables from the command line, and also
|
||||
# from the environment for the first two.
|
||||
SPHINXOPTS ?=
|
||||
SPHINXBUILD ?= sphinx-build
|
||||
SOURCEDIR = source
|
||||
BUILDDIR = build
|
||||
|
||||
# Put it first so that "make" without argument is like "make help".
|
||||
help:
|
||||
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
|
||||
|
||||
.PHONY: help Makefile
|
||||
|
||||
# Catch-all target: route all unknown targets to Sphinx using the new
|
||||
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
|
||||
%: Makefile
|
||||
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
|
225
docs/cid.obj
|
@ -1,225 +0,0 @@
|
|||
%TGIF 4.1.45-QPL
|
||||
state(0,37,100.000,0,0,0,16,1,9,1,1,2,0,1,0,1,1,'NewCenturySchlbk-Bold',1,103680,0,0,1,10,0,0,1,1,0,16,0,0,1,1,1,1,1050,1485,1,0,2880,0).
|
||||
%
|
||||
% @(#)$Header$
|
||||
% %W%
|
||||
%
|
||||
unit("1 pixel/pixel").
|
||||
color_info(19,65535,0,[
|
||||
"magenta", 65535, 0, 65535, 65535, 0, 65535, 1,
|
||||
"red", 65535, 0, 0, 65535, 0, 0, 1,
|
||||
"green", 0, 65535, 0, 0, 65535, 0, 1,
|
||||
"blue", 0, 0, 65535, 0, 0, 65535, 1,
|
||||
"yellow", 65535, 65535, 0, 65535, 65535, 0, 1,
|
||||
"pink", 65535, 49344, 52171, 65535, 49344, 52171, 1,
|
||||
"cyan", 0, 65535, 65535, 0, 65535, 65535, 1,
|
||||
"CadetBlue", 24415, 40606, 41120, 24415, 40606, 41120, 1,
|
||||
"white", 65535, 65535, 65535, 65535, 65535, 65535, 1,
|
||||
"black", 0, 0, 0, 0, 0, 0, 1,
|
||||
"DarkSlateGray", 12079, 20303, 20303, 12079, 20303, 20303, 1,
|
||||
"#00000000c000", 0, 0, 49344, 0, 0, 49152, 1,
|
||||
"#820782070000", 33410, 33410, 0, 33287, 33287, 0, 1,
|
||||
"#3cf3fbee34d2", 15420, 64507, 13364, 15603, 64494, 13522, 1,
|
||||
"#3cf3fbed34d3", 15420, 64507, 13364, 15603, 64493, 13523, 1,
|
||||
"#ffffa6990000", 65535, 42662, 0, 65535, 42649, 0, 1,
|
||||
"#ffff0000fffe", 65535, 0, 65535, 65535, 0, 65534, 1,
|
||||
"#fffe0000fffe", 65535, 0, 65535, 65534, 0, 65534, 1,
|
||||
"#fffe00000000", 65535, 0, 0, 65534, 0, 0, 1
|
||||
]).
|
||||
script_frac("0.6").
|
||||
fg_bg_colors('black','white').
|
||||
dont_reencode("FFDingbests:ZapfDingbats").
|
||||
objshadow_info('#c0c0c0',2,2).
|
||||
page(1,"",1,'').
|
||||
text('black',90,95,1,1,1,66,20,0,15,5,0,0,0,0,2,66,20,0,0,"",0,0,0,0,110,'',[
|
||||
minilines(66,20,0,0,1,0,0,[
|
||||
mini_line(66,15,5,0,0,0,[
|
||||
str_block(0,66,15,5,0,-1,0,0,0,[
|
||||
str_seg('black','Courier-Bold',1,103680,66,15,5,0,-1,0,0,0,0,0,
|
||||
"U+30FC")])
|
||||
])
|
||||
])]).
|
||||
text('black',100,285,1,1,1,66,20,3,15,5,0,0,0,0,2,66,20,0,0,"",0,0,0,0,300,'',[
|
||||
minilines(66,20,0,0,1,0,0,[
|
||||
mini_line(66,15,5,0,0,0,[
|
||||
str_block(0,66,15,5,0,-2,0,0,0,[
|
||||
str_seg('black','Courier-Bold',1,103680,66,15,5,0,-2,0,0,0,0,0,
|
||||
"U+5199")])
|
||||
])
|
||||
])]).
|
||||
text('black',400,38,2,1,1,119,30,5,12,3,0,0,0,0,2,119,30,0,0,"",0,0,0,0,50,'',[
|
||||
minilines(119,30,0,0,1,0,0,[
|
||||
mini_line(83,12,3,0,0,0,[
|
||||
str_block(0,83,12,3,0,-3,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,83,12,3,0,-3,0,0,0,0,0,
|
||||
"Adobe-Japan1")])
|
||||
]),
|
||||
mini_line(119,12,3,0,0,0,[
|
||||
str_block(0,119,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,119,12,3,0,-1,0,0,0,0,0,
|
||||
"CID:660 (horizontal)")])
|
||||
])
|
||||
])]).
|
||||
text('black',400,118,2,1,1,114,30,8,12,3,0,0,0,0,2,114,30,0,0,"",0,0,0,0,130,'',[
|
||||
minilines(114,30,0,0,1,0,0,[
|
||||
mini_line(83,12,3,0,0,0,[
|
||||
str_block(0,83,12,3,0,-3,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,83,12,3,0,-3,0,0,0,0,0,
|
||||
"Adobe-Japan1")])
|
||||
]),
|
||||
mini_line(114,12,3,0,0,0,[
|
||||
str_block(0,114,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,114,12,3,0,-1,0,0,0,0,0,
|
||||
"CID:7891 (vertical)")])
|
||||
])
|
||||
])]).
|
||||
text('black',400,238,2,1,1,125,30,15,12,3,0,0,0,0,2,125,30,0,0,"",0,0,0,0,250,'',[
|
||||
minilines(125,30,0,0,1,0,0,[
|
||||
mini_line(83,12,3,0,0,0,[
|
||||
str_block(0,83,12,3,0,-3,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,83,12,3,0,-3,0,0,0,0,0,
|
||||
"Adobe-Japan1")])
|
||||
]),
|
||||
mini_line(125,12,3,0,0,0,[
|
||||
str_block(0,125,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,125,12,3,0,-1,0,0,0,0,0,
|
||||
"CID:2296 (Japanese)")])
|
||||
])
|
||||
])]).
|
||||
text('black',400,318,2,1,1,115,30,16,12,3,0,0,0,0,2,115,30,0,0,"",0,0,0,0,330,'',[
|
||||
minilines(115,30,0,0,1,0,0,[
|
||||
mini_line(67,12,3,0,0,0,[
|
||||
str_block(0,67,12,3,0,-3,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,67,12,3,0,-3,0,0,0,0,0,
|
||||
"Adobe-GB1")])
|
||||
]),
|
||||
mini_line(115,12,3,0,0,0,[
|
||||
str_block(0,115,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,115,12,3,0,-1,0,0,0,0,0,
|
||||
"CID:3967 (Chinese)")])
|
||||
])
|
||||
])]).
|
||||
text('black',200,84,2,1,1,116,38,20,16,3,0,0,0,0,2,116,38,0,0,"",0,0,0,0,100,'',[
|
||||
minilines(116,38,0,0,1,0,0,[
|
||||
mini_line(70,16,3,0,0,0,[
|
||||
str_block(0,70,16,3,0,-1,0,0,0,[
|
||||
str_seg('black','NewCenturySchlbk-Roman',0,97920,70,16,3,0,-1,0,0,0,0,0,
|
||||
"Japanese")])
|
||||
]),
|
||||
mini_line(116,16,3,0,0,0,[
|
||||
str_block(0,116,16,3,0,-1,0,0,0,[
|
||||
str_seg('black','NewCenturySchlbk-Roman',0,97920,116,16,3,0,-1,0,0,0,0,0,
|
||||
"long-vowel sign")])
|
||||
])
|
||||
])]).
|
||||
oval('black','',30,70,280,140,0,1,1,49,0,0,0,0,0,'1',0,[
|
||||
]).
|
||||
oval('black','',30,260,280,330,0,1,1,51,0,0,0,0,0,'1',0,[
|
||||
]).
|
||||
text('black',200,274,2,1,1,85,38,53,16,3,0,0,0,0,2,85,38,0,0,"",0,0,0,0,290,'',[
|
||||
minilines(85,38,0,0,1,0,0,[
|
||||
mini_line(61,16,3,0,0,0,[
|
||||
str_block(0,61,16,3,0,-1,0,0,0,[
|
||||
str_seg('black','NewCenturySchlbk-Roman',0,97920,61,16,3,0,-1,0,0,0,0,0,
|
||||
"Chinese")])
|
||||
]),
|
||||
mini_line(85,16,3,0,0,0,[
|
||||
str_block(0,85,16,3,0,-1,0,0,0,[
|
||||
str_seg('black','NewCenturySchlbk-Roman',0,97920,85,16,3,0,-1,0,0,0,0,0,
|
||||
"letter \"sha\"")])
|
||||
])
|
||||
])]).
|
||||
box('black','',330,30,560,80,0,1,1,57,0,0,0,0,0,'1',0,[
|
||||
]).
|
||||
box('black','',330,110,560,160,0,1,1,59,0,0,0,0,0,'1',0,[
|
||||
]).
|
||||
box('black','',330,230,560,280,0,1,1,60,0,0,0,0,0,'1',0,[
|
||||
]).
|
||||
box('black','',330,310,560,360,0,1,1,61,0,0,0,0,0,'1',0,[
|
||||
]).
|
||||
group([
|
||||
poly('black','',4,[
|
||||
506,246,501,235,541,235,536,246],0,2,1,68,0,0,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
]),
|
||||
poly('black','',5,[
|
||||
519,238,516,252,529,252,524,275,516,272],0,2,1,69,0,0,0,0,0,0,0,'2',0,0,
|
||||
"00","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
]),
|
||||
poly('black','',2,[
|
||||
501,261,541,261],0,2,1,70,0,0,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
]),
|
||||
poly('black','',2,[
|
||||
519,244,529,244],0,2,1,71,0,0,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
])
|
||||
],
|
||||
76,0,0,[
|
||||
]).
|
||||
group([
|
||||
poly('black','',3,[
|
||||
519,119,524,127,524,152],0,2,1,67,0,0,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
])
|
||||
],
|
||||
78,0,0,[
|
||||
]).
|
||||
group([
|
||||
poly('black','',3,[
|
||||
540,57,509,57,501,49],0,2,1,66,0,0,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
])
|
||||
],
|
||||
80,0,0,[
|
||||
]).
|
||||
group([
|
||||
poly('black','',4,[
|
||||
506,326,501,315,541,315,536,326],0,2,1,90,0,0,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
]),
|
||||
poly('black','',5,[
|
||||
519,318,515,332,531,332,526,355,519,352],0,2,1,89,0,0,0,0,0,0,0,'2',0,0,
|
||||
"00","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
]),
|
||||
poly('black','',2,[
|
||||
501,341,526,341],0,2,1,88,0,0,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
]),
|
||||
poly('black','',2,[
|
||||
519,324,529,324],0,2,1,87,0,0,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
])
|
||||
],
|
||||
134,0,0,[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
270,90,320,70],1,3,1,158,0,0,0,0,0,0,0,'3',0,0,
|
||||
"0","",[
|
||||
0,12,5,0,'12','5','0'],[0,12,5,0,'12','5','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
280,110,320,130],1,3,1,159,0,0,0,0,0,0,0,'3',0,0,
|
||||
"0","",[
|
||||
0,12,5,0,'12','5','0'],[0,12,5,0,'12','5','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
270,280,310,250],1,3,1,160,0,0,0,0,0,0,0,'3',0,0,
|
||||
"0","",[
|
||||
0,12,5,0,'12','5','0'],[0,12,5,0,'12','5','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
270,300,310,330],1,3,1,161,0,0,0,0,0,0,0,'3',0,0,
|
||||
"0","",[
|
||||
0,12,5,0,'12','5','0'],[0,12,5,0,'12','5','0'],[
|
||||
]).
|
BIN
docs/cid.png
Binary file not shown.
Before Width: | Height: | Size: 2.6 KiB |
427
docs/index.html
|
@ -1,427 +0,0 @@
|
|||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
|
||||
<html>
|
||||
<head>
|
||||
<link rel="stylesheet" type="text/css" href="style.css">
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
|
||||
<title>PDFMiner</title>
|
||||
</head>
|
||||
<body>
|
||||
|
||||
<div align=right class=lastmod>
|
||||
<!-- hhmts start -->
|
||||
Last Modified: Wed Jun 25 10:27:52 UTC 2014
|
||||
<!-- hhmts end -->
|
||||
</div>
|
||||
|
||||
<h1>PDFMiner</h1>
|
||||
<p>
|
||||
Python PDF parser and analyzer
|
||||
|
||||
<p>
|
||||
<a href="http://www.unixuser.org/~euske/python/pdfminer/index.html">Homepage</a>
|
||||
|
||||
<a href="#changes">Recent Changes</a>
|
||||
|
||||
<a href="programming.html">PDFMiner API</a>
|
||||
|
||||
<ul>
|
||||
<li> <a href="#intro">What's It?</a>
|
||||
<li> <a href="#download">Download</a>
|
||||
<li> <a href="#wheretoask">Where to Ask</a>
|
||||
<li> <a href="#install">How to Install</a>
|
||||
<ul>
|
||||
<li> <a href="#cmap">CJK languages support</a>
|
||||
</ul>
|
||||
<li> <a href="#tools">Command Line Tools</a>
|
||||
<ul>
|
||||
<li> <a href="#pdf2txt">pdf2txt.py</a>
|
||||
<li> <a href="#dumppdf">dumppdf.py</a>
|
||||
<li> <a href="programming.html">PDFMiner API</a>
|
||||
</ul>
|
||||
<li> <a href="#changes">Changes</a>
|
||||
<li> <a href="#todo">TODO</a>
|
||||
<li> <a href="#related">Related Projects</a>
|
||||
<li> <a href="#license">Terms and Conditions</a>
|
||||
</ul>
|
||||
|
||||
<h2><a name="intro">What's It?</a></h2>
|
||||
<p>
|
||||
PDFMiner is a tool for extracting information from PDF documents.
|
||||
Unlike other PDF-related tools, it focuses entirely on getting
|
||||
and analyzing text data. PDFMiner allows one to obtain
|
||||
the exact location of text in a page, as well as
|
||||
other information such as fonts or lines.
|
||||
It includes a PDF converter that can transform PDF files
|
||||
into other text formats (such as HTML). It has an extensible
|
||||
PDF parser that can be used for other purposes than text analysis.
|
||||
|
||||
<p>
|
||||
<h3>Features</h3>
|
||||
<ul>
|
||||
<li> Written entirely in Python. (for version 2.6 or newer)
|
||||
<li> Parse, analyze, and convert PDF documents.
|
||||
<li> PDF-1.7 specification support. (well, almost)
|
||||
<li> CJK languages and vertical writing scripts support.
|
||||
<li> Various font types (Type1, TrueType, Type3, and CID) support.
|
||||
<li> Basic encryption (RC4) support.
|
||||
<li> PDF to HTML conversion (with a sample converter web app).
|
||||
<li> Outline (TOC) extraction.
|
||||
<li> Tagged contents extraction.
|
||||
<li> Reconstruct the original layout by grouping text chunks.
|
||||
</ul>
|
||||
<p>
|
||||
PDFMiner is about 20 times slower than
|
||||
other C/C++-based counterparts such as XPdf.
|
||||
|
||||
<P>
|
||||
<strong>Online Demo:</strong> (pdf -> html conversion webapp)<br>
|
||||
<a href="http://pdf2html.tabesugi.net:8080/">
|
||||
http://pdf2html.tabesugi.net:8080/
|
||||
</a>
|
||||
|
||||
<h3><a name="download">Download</a></h3>
|
||||
<p>
|
||||
<strong>Source distribution:</strong><br>
|
||||
<a href="http://pypi.python.org/pypi/pdfminer_six/">
|
||||
http://pypi.python.org/pypi/pdfminer_six/
|
||||
</a>
|
||||
|
||||
<P>
|
||||
<strong>github:</strong><br>
|
||||
<a href="https://github.com/goulu/pdfminer/">
|
||||
https://github.com/goulu/pdfminer/
|
||||
</a>
|
||||
|
||||
<h3><a name="wheretoask">Where to Ask</a></h3>
|
||||
<p>
|
||||
<p>
|
||||
<strong>Questions and comments:</strong><br>
|
||||
<a href="http://groups.google.com/group/pdfminer-users/">
|
||||
http://groups.google.com/group/pdfminer-users/
|
||||
</a>
|
||||
|
||||
<h2><a name="install">How to Install</a></h2>
|
||||
<ol>
|
||||
<li> Install <a href="http://www.python.org/download/">Python</a> 2.6 or newer.
|
||||
<li> Download the <a href="#source">PDFMiner source</a>.
|
||||
<li> Unpack it.
|
||||
<li> Run <code>setup.py</code> to install:<br>
|
||||
<blockquote><pre>
|
||||
# <strong>python setup.py install</strong>
|
||||
</pre></blockquote>
|
||||
<li> Do the following test:<br>
|
||||
<blockquote><pre>
|
||||
$ <strong>pdf2txt.py samples/simple1.pdf</strong>
|
||||
Hello
|
||||
|
||||
World
|
||||
|
||||
Hello
|
||||
|
||||
World
|
||||
|
||||
H e l l o
|
||||
|
||||
W o r l d
|
||||
|
||||
H e l l o
|
||||
|
||||
W o r l d
|
||||
</pre></blockquote>
|
||||
<li> Done!
|
||||
</ol>
|
||||
|
||||
<h3><a name="cmap">For CJK languages</a></h3>
|
||||
<p>
|
||||
In order to process CJK languages, you need to take an additional step
|
||||
during installation:
|
||||
<blockquote><pre>
|
||||
# <strong>make cmap</strong>
|
||||
python tools/conv_cmap.py pdfminer/cmap Adobe-CNS1 cmaprsrc/cid2code_Adobe_CNS1.txt
|
||||
reading 'cmaprsrc/cid2code_Adobe_CNS1.txt'...
|
||||
writing 'CNS1_H.py'...
|
||||
...
|
||||
<em>(this may take several minutes)</em>
|
||||
|
||||
# <strong>python setup.py install</strong>
|
||||
</pre></blockquote>
|
||||
|
||||
<p>
|
||||
On Windows machines which don't have <code>make</code> command,
|
||||
paste the following commands on a command line prompt:
|
||||
<blockquote><pre>
|
||||
<strong>mkdir pdfminer\cmap</strong>
|
||||
<strong>python tools\conv_cmap.py -c B5=cp950 -c UniCNS-UTF8=utf-8 pdfminer\cmap Adobe-CNS1 cmaprsrc\cid2code_Adobe_CNS1.txt</strong>
|
||||
<strong>python tools\conv_cmap.py -c GBK-EUC=cp936 -c UniGB-UTF8=utf-8 pdfminer\cmap Adobe-GB1 cmaprsrc\cid2code_Adobe_GB1.txt</strong>
|
||||
<strong>python tools\conv_cmap.py -c RKSJ=cp932 -c EUC=euc-jp -c UniJIS-UTF8=utf-8 pdfminer\cmap Adobe-Japan1 cmaprsrc\cid2code_Adobe_Japan1.txt</strong>
|
||||
<strong>python tools\conv_cmap.py -c KSC-EUC=euc-kr -c KSC-Johab=johab -c KSCms-UHC=cp949 -c UniKS-UTF8=utf-8 pdfminer\cmap Adobe-Korea1 cmaprsrc\cid2code_Adobe_Korea1.txt</strong>
|
||||
<strong>python setup.py install</strong>
|
||||
</pre></blockquote>
|
||||
|
||||
<h2><a name="tools">Command Line Tools</a></h2>
|
||||
<p>
|
||||
PDFMiner comes with two handy tools:
|
||||
<code>pdf2txt.py</code> and <code>dumppdf.py</code>.
|
||||
|
||||
<h3><a name="pdf2txt">pdf2txt.py</a></h3>
|
||||
<p>
|
||||
<code>pdf2txt.py</code> extracts text contents from a PDF file.
|
||||
It extracts all the text that is to be rendered programmatically,
|
||||
i.e. text represented as ASCII or Unicode strings.
|
||||
It cannot recognize text drawn as images that would require optical character recognition.
|
||||
It also extracts the corresponding locations, font names, font sizes, writing
|
||||
direction (horizontal or vertical) for each text portion.
|
||||
You need to provide a password for protected PDF documents whose access is restricted.
|
||||
You cannot extract any text from a PDF document which does not have extraction permission.
|
||||
|
||||
<p>
|
||||
<strong>Note:</strong>
|
||||
Not all characters in a PDF can be safely converted to Unicode.
|
||||
|
||||
<h4>Examples</h4>
|
||||
<blockquote><pre>
|
||||
$ <strong>pdf2txt.py -o output.html samples/naacl06-shinyama.pdf</strong>
|
||||
(extract text as an HTML file whose filename is output.html)
|
||||
|
||||
$ <strong>pdf2txt.py -V -c euc-jp -o output.html samples/jo.pdf</strong>
|
||||
(extract a Japanese HTML file in vertical writing, CMap is required)
|
||||
|
||||
$ <strong>pdf2txt.py -P mypassword -o output.txt secret.pdf</strong>
|
||||
(extract text from an encrypted PDF file)
|
||||
</pre></blockquote>
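The invocations above can also be assembled from Python. The following is a minimal sketch, not part of pdfminer.six: `build_pdf2txt_command` is a hypothetical helper, and it assumes only the `-o`, `-P` and `-p` options described in this document and the `python pdf2txt.py ...` form shown in the README. It builds the argument list without running anything, so the result can be inspected or handed to `subprocess.run`.

```python
import shlex


def build_pdf2txt_command(pdf_path, output=None, password=None, pages=None,
                          script="pdf2txt.py"):
    """Assemble an argument list for pdf2txt.py (hypothetical helper).

    Only the -o, -P and -p options documented here are used.
    """
    cmd = ["python", script]
    if output is not None:
        cmd += ["-o", output]        # output file name
    if password is not None:
        cmd += ["-P", password]      # user password for protected PDFs
    if pages:
        # The documentation states that page numbers start at one.
        if any(page < 1 for page in pages):
            raise ValueError("page numbers start at one")
        cmd += ["-p", ",".join(str(page) for page in pages)]
    cmd.append(pdf_path)
    return cmd


command = build_pdf2txt_command("samples/simple1.pdf",
                                output="output.txt", pages=[1, 2])
# Show the command as it would be typed in a shell.
print(" ".join(shlex.quote(part) for part in command))
```

Passing the returned list to `subprocess.run(command, check=True)` would execute it, assuming `pdf2txt.py` is reachable from the working directory.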
|
||||
|
||||
<h4>Options</h4>
|
||||
<dl>
|
||||
<dt> <code>-o <em>filename</em></code>
|
||||
<dd> Specifies the output file name.
|
||||
By default, it prints the extracted contents to stdout in text format.
|
||||
<p>
|
||||
<dt> <code>-p <em>pageno[,pageno,...]</em></code>
|
||||
<dd> Specifies the comma-separated list of the page numbers to be extracted.
|
||||
Page numbers start at one.
|
||||
By default, it extracts text from all the pages.
|
||||
<p>
|
||||
<dt> <code>-c <em>codec</em></code>
|
||||
<dd> Specifies the output codec.
|
||||
<p>
|
||||
<dt> <code>-t <em>type</em></code>
|
||||
<dd> Specifies the output format. The following formats are currently supported.
|
||||
<ul>
|
||||
<li> <code>text</code> : TEXT format. (Default)
|
||||
<li> <code>html</code> : HTML format. Not recommended for extraction purposes because the markup is messy.
|
||||
<li> <code>xml</code> : XML format. Provides the most information.
|
||||
<li> <code>tag</code> : "Tagged PDF" format. A tagged PDF has its own contents annotated with
|
||||
HTML-like tags. pdf2txt tries to extract its content streams rather than inferring its text locations.
|
||||
Tags used here are defined in the PDF specification (See §10.7 "<em>Tagged PDF</em>").
|
||||
</ul>
|
||||
<p>
|
||||
<dt> <code>-I <em>image_directory</em></code>
|
||||
<dd> Specifies the output directory for image extraction.
|
||||
Currently only JPEG images are supported.
|
||||
<p>
|
||||
<dt> <code>-M <em>char_margin</em></code>
|
||||
<dt> <code>-L <em>line_margin</em></code>
|
||||
<dt> <code>-W <em>word_margin</em></code>
|
||||
<dd> These are the parameters used for layout analysis.
|
||||
In an actual PDF file, text portions might be split into several chunks
|
||||
in the middle of its running, depending on the authoring software.
|
||||
Therefore, text extraction needs to splice text chunks.
|
||||
In the figure below, two text chunks whose distance is closer than
|
||||
the <em>char_margin</em> (shown as <em><font color="red">M</font></em>) are considered
|
||||
continuous and get grouped into one. Also, two lines whose distance is closer than
|
||||
the <em>line_margin</em> (<em><font color="blue">L</font></em>) are grouped
|
||||
as a text box, which is a rectangular area that contains a "cluster" of text portions.
|
||||
Furthermore, it may be required to insert blank characters (spaces) as necessary
|
||||
if the distance between two words is greater than the <em>word_margin</em>
|
||||
(<em><font color="green">W</font></em>), as a blank between words might not be
|
||||
represented as a space, but indicated by the positioning of each word.
|
||||
<p>
|
||||
Each value is specified not as an actual length, but as a proportion of
|
||||
the length to the size of each character in question. The default values
|
||||
are M = 2.0, L = 0.5, and W = 0.1, respectively.
|
||||
<table style="border:2px gray solid; margin: 10px; padding: 10px;"><tr>
|
||||
<td style="border-right:1px red solid" align=right>→</td>
|
||||
<td style="border-left:1px red solid" colspan="4" align=left>← <em><font color="red">M</font></em></td>
|
||||
<td></td>
|
||||
</tr><tr>
|
||||
<td style="border:1px solid"><code>Q u i</code></td>
|
||||
<td style="border:1px solid"><code>c k</code></td>
|
||||
<td width="10px"></td>
|
||||
<td style="border:1px solid"><code>b r o w</code></td>
|
||||
<td style="border:1px solid"><code>n f o x</code></td>
|
||||
<td style="border-bottom:1px blue solid" align=right>↓</td>
|
||||
</tr><tr>
|
||||
<td style="border-right:1px green solid" colspan="2" align=right>→</td><td></td>
|
||||
<td style="border-left:1px green solid" colspan="2" align=left>← <em><font color="green">W</font></em></td>
|
||||
<td rowspan="2" valign=center align=center><em><font color="blue">L</font></em></td>
|
||||
</tr><tr height="10px">
|
||||
</tr><tr>
|
||||
<td style="padding:0px;" colspan="5">
|
||||
<table style="border:1px solid"><tr><td><code>j u m p s</code></td><td>...</td></tr></table>
|
||||
</td>
|
||||
<td style="border-top:1px blue solid" align=right>↑</td>
|
||||
</tr></table>
|
||||
<p>
|
||||
<dt> <code>-F <em>boxes_flow</em></code>
|
||||
<dd> Specifies how much the horizontal and vertical position of text matters
|
||||
when determining the text order. The value should be within the range of
|
||||
-1.0 (only horizontal position matters) to +1.0 (only vertical position matters).
|
||||
The default value is 0.5.
|
||||
<p>
|
||||
<dt> <code>-C</code>
|
||||
<dd> Suppress object caching.
|
||||
This reduces memory consumption but also slows down the process.
|
||||
<p>
|
||||
<dt> <code>-n</code>
|
||||
<dd> Suppress layout analysis.
|
||||
<p>
|
||||
<dt> <code>-A</code>
|
||||
<dd> Forces layout analysis to be performed on all the text strings,
|
||||
including text contained in figures.
|
||||
<p>
|
||||
<dt> <code>-V</code>
|
||||
<dd> Allows vertical writing detection.
|
||||
<p>
|
||||
<dt> <code>-Y <em>layout_mode</em></code>
|
||||
<dd> Specifies how the page layout should be preserved. (Currently only applies to HTML format.)
|
||||
<ul>
|
||||
<li> <code>exact</code> : preserve the exact location of each individual character (a large and messy HTML).
|
||||
<li> <code>normal</code> : preserve the location and line breaks in each text block. (Default)
|
||||
<li> <code>loose</code> : preserve the overall location of each text block.
|
||||
</ul>
|
||||
<p>
|
||||
<dt> <code>-E <em>extractdir</em></code>
|
||||
<dd> Specifies the extraction directory of embedded files.
|
||||
<p>
|
||||
<dt> <code>-s <em>scale</em></code>
|
||||
<dd> Specifies the output scale. Can be used in HTML format only.
|
||||
<p>
|
||||
<dt> <code>-m <em>maxpages</em></code>
|
||||
<dd> Specifies the maximum number of pages to extract.
|
||||
By default, it extracts all the pages in a document.
|
||||
<p>
|
||||
<dt> <code>-P <em>password</em></code>
|
||||
<dd> Provides the user password to access PDF contents.
|
||||
<p>
|
||||
<dt> <code>-d</code>
|
||||
<dd> Increases the debug level.
|
||||
</dl>
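<p>
One way to picture the <code>-F</code> range is as a weight inside a sort key. The sketch below is an illustration only: <code>Box</code> and <code>reading_order</code> are hypothetical names, and the linear weighting is an assumption chosen to match the documented endpoints (-1.0 and +1.0), not a statement of how PDFMiner orders text internally.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Hypothetical text box in PDF coordinates (y grows upward)."""
    name: str
    x0: float
    y0: float
    x1: float
    y1: float


def reading_order(boxes: List[Box], boxes_flow: float = 0.5) -> List[str]:
    """Sort boxes with a key that blends horizontal and vertical position."""
    if not -1.0 <= boxes_flow <= 1.0:
        raise ValueError("boxes_flow must be within -1.0 .. +1.0")

    def key(box: Box) -> float:
        # At -1.0 the vertical term vanishes; at +1.0 the horizontal term does.
        horizontal = (1 - boxes_flow) * box.x0
        vertical = (1 + boxes_flow) * (box.y0 + box.y1)
        return horizontal - vertical  # boxes further left and higher sort first

    return [box.name for box in sorted(boxes, key=key)]


boxes = [
    Box("top-right", 300, 700, 500, 720),
    Box("bottom-left", 50, 100, 250, 120),
]
print(reading_order(boxes, -1.0))  # horizontal position only
print(reading_order(boxes, +1.0))  # vertical position only
```

<p>
With the two sample boxes, the extreme values give opposite orders, which is the trade-off the option exposes.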
|
||||
|
||||
<hr noshade>
|
||||
|
||||
<h3><a name="dumppdf">dumppdf.py</a></h3>
|
||||
<p>
|
||||
<code>dumppdf.py</code> dumps the internal contents of a PDF file
|
||||
in pseudo-XML format. This program is primarily for debugging purposes,
|
||||
but it's also possible to extract some meaningful contents
|
||||
(such as images).
|
||||
|
||||
<h4>Examples</h4>
|
||||
<blockquote><pre>
|
||||
$ <strong>dumppdf.py -a foo.pdf</strong>
|
||||
(dump all the headers and contents, except stream objects)
|
||||
|
||||
$ <strong>dumppdf.py -T foo.pdf</strong>
|
||||
(dump the table of contents)
|
||||
|
||||
$ <strong>dumppdf.py -r -i6 foo.pdf > pic.jpeg</strong>
|
||||
(extract a JPEG image)
|
||||
</pre></blockquote>
|
||||
|
||||
<h4>Options</h4>
|
||||
<dl>
|
||||
<dt> <code>-a</code>
|
||||
<dd> Dumps all the objects.
|
||||
By default, it only prints the document trailer (like a header).
|
||||
<p>
|
||||
<dt> <code>-i <em>objno,objno, ...</em></code>
|
||||
<dd> Specifies PDF object IDs to display.
|
||||
Comma-separated IDs, or multiple <code>-i</code> options are accepted.
|
||||
<p>
|
||||
<dt> <code>-p <em>pageno,pageno, ...</em></code>
|
||||
<dd> Specifies the page numbers to be extracted.
|
||||
Comma-separated page numbers, or multiple <code>-p</code> options are accepted.
|
||||
Note that page numbers start at one, not zero.
|
||||
<p>
|
||||
<dt> <code>-r</code> (raw)
|
||||
<dt> <code>-b</code> (binary)
|
||||
<dt> <code>-t</code> (text)
|
||||
<dd> Specifies the output format of stream contents.
|
||||
Because the contents of stream objects can be very large,
|
||||
they are omitted when none of the options above is specified.
|
||||
<p>
|
||||
With the <code>-r</code> option, the "raw" stream contents are dumped without decompression.
With the <code>-b</code> option, the decompressed contents are dumped as a binary blob.
With the <code>-t</code> option, the decompressed contents are dumped in a text format,
in a manner similar to <code>repr()</code>. When the
<code>-r</code> or <code>-b</code> option is given,
no stream header is displayed, which makes it easier to save the output to a file.
|
||||
<p>
|
||||
<dt> <code>-T</code>
|
||||
<dd> Shows the table of contents.
|
||||
<p>
|
||||
<dt> <code>-E <em>directory</em></code>
|
||||
<dd> Extracts embedded files from the PDF into the given directory.
|
||||
<p>
|
||||
<dt> <code>-P <em>password</em></code>
|
||||
<dd> Provides the user password to access PDF contents.
|
||||
<p>
|
||||
<dt> <code>-d</code>
|
||||
<dd> Increases the debug level.
|
||||
</dl>
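<p>
The difference between the <code>-r</code>, <code>-b</code> and <code>-t</code> output formats can be sketched with the standard library alone. This is a minimal illustration, not dumppdf.py's code: <code>dump_stream</code> and its <code>mode</code> names are hypothetical, and the sample stream is assumed to be zlib-compressed, which is only one of the filters a PDF stream may use.

```python
import zlib

# Hypothetical stream: the bytes as stored in the file are compressed.
content = b"BT /F1 12 Tf (Hello) Tj ET"
stored = zlib.compress(content)


def dump_stream(stored: bytes, mode: str) -> bytes:
    """Return the stream in a raw, binary, or text form."""
    if mode == "raw":      # like -r: bytes as stored, no decompression
        return stored
    data = zlib.decompress(stored)
    if mode == "binary":   # like -b: decompressed bytes
        return data
    if mode == "text":     # like -t: printable repr() of the decompressed bytes
        return repr(data).encode("ascii")
    raise ValueError("unknown mode: %r" % mode)


for mode in ("raw", "binary", "text"):
    print(mode, len(dump_stream(stored, mode)))
```

<p>
Only the raw and binary forms return the bytes themselves, which is why they are the ones suited to being redirected into a file.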
|
||||
|
||||
<h2><a name="changes">Changes:</a></h2>
|
||||
<ul>
|
||||
<li> 2014/09/15: published on PyPI</li>
|
||||
<li> 2014/09/10: pdfminer_six forked from pdfminer since Yusuke didn't want to merge and pdfminer3k is outdated</li>
|
||||
</ul>
|
||||
|
||||
<h2><a name="todo">TODO</a></h2>
|
||||
<ul>
|
||||
<li> <a href="http://www.python.org/dev/peps/pep-0008/">PEP-8</a> and
|
||||
<a href="http://www.python.org/dev/peps/pep-0257/">PEP-257</a> conformance.
|
||||
<li> Better documentation.
|
||||
<li> Better text extraction / layout analysis. (writing mode detection, Type1 font file analysis, etc.)
|
||||
<li> Crypt stream filter support. (More sample documents are needed!)
|
||||
</ul>
|
||||
|
||||
<h2><a name="related">Related Projects</a></h2>
|
||||
<ul>
|
||||
<li> <a href="http://pybrary.net/pyPdf/">pyPdf</a>
|
||||
<li> <a href="http://www.foolabs.com/xpdf/">xpdf</a>
|
||||
<li> <a href="http://www.pdfbox.org/">pdfbox</a>
|
||||
<li> <a href="http://mupdf.com/">mupdf</a>
|
||||
</ul>
|
||||
|
||||
<h2><a name="license">Terms and Conditions</a></h2>
|
||||
<p>
|
||||
(This is the so-called MIT/X License.)
|
||||
<p>
|
||||
<small>
|
||||
Copyright (c) 2004-2013 Yusuke Shinyama <yusuke at cs dot nyu dot edu>
|
||||
<p>
|
||||
Permission is hereby granted, free of charge, to any person
|
||||
obtaining a copy of this software and associated documentation
|
||||
files (the "Software"), to deal in the Software without
|
||||
restriction, including without limitation the rights to use,
|
||||
copy, modify, merge, publish, distribute, sublicense, and/or
|
||||
sell copies of the Software, and to permit persons to whom the
|
||||
Software is furnished to do so, subject to the following
|
||||
conditions:
|
||||
<p>
|
||||
The above copyright notice and this permission notice shall be
|
||||
included in all copies or substantial portions of the Software.
|
||||
<p>
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
|
||||
KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE
|
||||
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
|
||||
PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
|
||||
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
|
||||
OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
|
||||
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
</small>
|
||||
|
||||
<hr noshade>
|
||||
<address>Yusuke Shinyama (yusuke at cs dot nyu dot edu)</address>
|
||||
</body>
|
391
docs/layout.obj
|
@ -1,391 +0,0 @@
|
|||
%TGIF 4.2.2
|
||||
state(0,37,100.000,0,0,0,16,1,9,1,1,0,0,0,0,1,1,'Helvetica-Bold',1,69120,0,0,1,5,0,0,1,1,0,16,0,0,1,1,1,1,1050,1485,1,0,2880,0).
|
||||
%
|
||||
% @(#)$Header$
|
||||
% %W%
|
||||
%
|
||||
unit("1 pixel/pixel").
|
||||
color_info(19,65535,0,[
|
||||
"magenta", 65535, 0, 65535, 65535, 0, 65535, 1,
|
||||
"red", 65535, 0, 0, 65535, 0, 0, 1,
|
||||
"green", 0, 65535, 0, 0, 65535, 0, 1,
|
||||
"blue", 0, 0, 65535, 0, 0, 65535, 1,
|
||||
"yellow", 65535, 65535, 0, 65535, 65535, 0, 1,
|
||||
"pink", 65535, 49344, 52171, 65535, 49344, 52171, 1,
|
||||
"cyan", 0, 65535, 65535, 0, 65535, 65535, 1,
|
||||
"CadetBlue", 24415, 40606, 41120, 24415, 40606, 41120, 1,
|
||||
"white", 65535, 65535, 65535, 65535, 65535, 65535, 1,
|
||||
"black", 0, 0, 0, 0, 0, 0, 1,
|
||||
"DarkSlateGray", 12079, 20303, 20303, 12079, 20303, 20303, 1,
|
||||
"#00000000c000", 0, 0, 49344, 0, 0, 49152, 1,
|
||||
"#820782070000", 33410, 33410, 0, 33287, 33287, 0, 1,
|
||||
"#3cf3fbee34d2", 15420, 64507, 13364, 15603, 64494, 13522, 1,
|
||||
"#3cf3fbed34d3", 15420, 64507, 13364, 15603, 64493, 13523, 1,
|
||||
"#ffffa6990000", 65535, 42662, 0, 65535, 42649, 0, 1,
|
||||
"#ffff0000fffe", 65535, 0, 65535, 65535, 0, 65534, 1,
|
||||
"#fffe0000fffe", 65535, 0, 65535, 65534, 0, 65534, 1,
|
||||
"#fffe00000000", 65535, 0, 0, 65534, 0, 0, 1
|
||||
]).
|
||||
script_frac("0.6").
|
||||
fg_bg_colors('black','white').
|
||||
dont_reencode("FFDingbests:ZapfDingbats").
|
||||
objshadow_info('#c0c0c0',2,2).
|
||||
rotate_pivot(0,0,0,0).
|
||||
spline_tightness(1).
|
||||
page(1,"",1,'').
|
||||
box('black','',50,45,300,355,2,2,1,0,0,0,0,0,0,'2',0,[
|
||||
]).
|
||||
box('black','',75,75,195,225,2,1,1,10,8,0,0,0,0,'1',0,[
|
||||
]).
|
||||
box('black','',85,105,185,125,2,1,1,18,8,0,0,0,0,'1',0,[
|
||||
]).
|
||||
box('black','',85,105,105,125,2,1,1,19,0,0,0,0,0,'1',0,[
|
||||
]).
|
||||
box('black','',105,105,125,125,2,1,1,20,0,0,0,0,0,'1',0,[
|
||||
]).
|
||||
text('black',95,108,1,1,1,9,15,21,12,3,0,0,0,0,2,9,15,0,0,"",0,0,0,0,120,'',[
|
||||
minilines(9,15,0,0,1,0,0,[
|
||||
mini_line(9,12,3,0,0,0,[
|
||||
str_block(0,9,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica',0,69120,9,12,3,0,-1,0,0,0,0,0,
|
||||
"A")])
|
||||
])
|
||||
])]).
|
||||
text('black',115,108,1,1,1,8,15,28,12,3,0,0,0,0,2,8,15,0,0,"",0,0,0,0,120,'',[
|
||||
minilines(8,15,0,0,1,0,0,[
|
||||
mini_line(8,12,3,0,0,0,[
|
||||
str_block(0,8,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica',0,69120,8,12,3,0,-1,0,0,0,0,0,
|
||||
"B")])
|
||||
])
|
||||
])]).
|
||||
box('black','',125,105,145,125,0,1,1,32,0,0,0,0,0,'1',0,[
|
||||
]).
|
||||
text('black',135,108,1,1,1,9,15,36,12,3,0,0,0,0,2,9,15,0,0,"",0,0,0,0,120,'',[
|
||||
minilines(9,15,0,0,1,0,0,[
|
||||
mini_line(9,12,3,0,0,0,[
|
||||
str_block(0,9,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica',0,69120,9,12,3,0,-1,0,0,0,0,0,
|
||||
"C")])
|
||||
])
|
||||
])]).
|
||||
poly('black','',2,[
|
||||
215,140,215,220],0,3,1,51,0,0,0,0,0,0,0,'3',0,0,
|
||||
"0","",[
|
||||
0,12,5,0,'12','5','0'],[0,12,5,0,'12','5','0'],[
|
||||
]).
|
||||
box('black','',175,265,270,325,0,3,1,65,0,0,0,0,0,'3',0,[
|
||||
]).
|
||||
box('black','',185,270,260,320,0,1,1,69,8,0,0,0,0,'1',0,[
|
||||
]).
|
||||
poly('black','',6,[
|
||||
195,295,215,290,235,310,245,285,225,300,195,295],0,2,1,74,0,0,0,0,0,0,0,'2',0,0,
|
||||
"00","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
]).
|
||||
box('black','',85,275,140,315,1,2,0,87,0,0,0,0,0,'2',0,[
|
||||
]).
|
||||
text('black',85,23,1,1,1,44,15,93,12,3,0,0,0,0,2,44,15,0,0,"",0,0,0,0,35,'',[
|
||||
minilines(44,15,0,0,1,0,0,[
|
||||
mini_line(44,12,3,0,0,0,[
|
||||
str_block(0,44,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,44,12,3,0,-1,0,0,0,0,0,
|
||||
"LTPage")])
|
||||
])
|
||||
])]).
|
||||
text('black',255,133,1,1,1,39,15,100,12,3,0,0,0,0,2,39,15,0,0,"",0,0,0,0,145,'',[
|
||||
minilines(39,15,0,0,1,0,0,[
|
||||
mini_line(39,12,3,0,0,0,[
|
||||
str_block(0,39,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,39,12,3,0,-1,0,0,0,0,0,
|
||||
"LTLine")])
|
||||
])
|
||||
])]).
|
||||
text('black',125,83,1,1,1,42,15,104,12,3,0,0,0,0,2,42,15,0,0,"",0,0,0,0,95,'',[
|
||||
minilines(42,15,0,0,1,0,0,[
|
||||
mini_line(42,12,3,0,0,0,[
|
||||
str_block(0,42,12,3,0,0,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,42,12,3,0,0,0,0,0,0,0,
|
||||
"LTChar")])
|
||||
])
|
||||
])]).
|
||||
text('black',245,53,1,1,1,65,15,108,12,3,0,0,0,0,2,65,15,0,0,"",0,0,0,0,65,'',[
|
||||
minilines(65,15,0,0,1,0,0,[
|
||||
mini_line(65,12,3,0,0,0,[
|
||||
str_block(0,65,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,65,12,3,0,-1,0,0,0,0,0,
|
||||
"LTTextBox")])
|
||||
])
|
||||
])]).
|
||||
text('black',245,88,1,1,1,66,15,110,12,3,0,0,0,0,2,66,15,0,0,"",0,0,0,0,100,'',[
|
||||
minilines(66,15,0,0,1,0,0,[
|
||||
mini_line(66,12,3,0,0,0,[
|
||||
str_block(0,66,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,66,12,3,0,-1,0,0,0,0,0,
|
||||
"LTTextLine")])
|
||||
])
|
||||
])]).
|
||||
text('black',255,243,1,1,1,51,15,112,12,3,0,0,0,0,2,51,15,0,0,"",0,0,0,0,255,'',[
|
||||
minilines(51,15,0,0,1,0,0,[
|
||||
mini_line(51,12,3,0,0,0,[
|
||||
str_block(0,51,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,51,12,3,0,-1,0,0,0,0,0,
|
||||
"LTFigure")])
|
||||
])
|
||||
])]).
|
||||
text('black',140,243,1,1,1,51,15,114,12,3,0,0,0,0,2,51,15,0,0,"",0,0,0,0,255,'',[
|
||||
minilines(51,15,0,0,1,0,0,[
|
||||
mini_line(51,12,3,0,0,0,[
|
||||
str_block(0,51,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,51,12,3,0,-1,0,0,0,0,0,
|
||||
"LTImage")])
|
||||
])
|
||||
])]).
|
||||
text('black',240,223,1,1,1,43,15,116,12,3,0,0,0,0,2,43,15,0,0,"",0,0,0,0,235,'',[
|
||||
minilines(43,15,0,0,1,0,0,[
|
||||
mini_line(43,12,3,0,0,0,[
|
||||
str_block(0,43,12,3,0,0,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,43,12,3,0,0,0,0,0,0,0,
|
||||
"LTRect")])
|
||||
])
|
||||
])]).
|
||||
text('black',190,333,1,1,1,50,15,118,12,3,0,0,0,0,2,50,15,0,0,"",0,0,0,0,345,'',[
|
||||
minilines(50,15,0,0,1,0,0,[
|
||||
mini_line(50,12,3,0,0,0,[
|
||||
str_block(0,50,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,50,12,3,0,-1,0,0,0,0,0,
|
||||
"LTCurve")])
|
||||
])
|
||||
])]).
|
||||
text('black',170,138,1,1,1,42,15,121,12,3,0,0,0,0,2,42,15,0,0,"",0,0,0,0,150,'',[
|
||||
minilines(42,15,0,0,1,0,0,[
|
||||
mini_line(42,12,3,0,0,0,[
|
||||
str_block(0,42,12,3,0,0,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,42,12,3,0,0,0,0,0,0,0,
|
||||
"LTText")])
|
||||
])
|
||||
])]).
|
||||
box('black','',145,105,165,125,0,1,1,125,8,0,0,0,0,'1',0,[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
105,95,95,110],0,1,1,135,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
165,140,155,115],0,1,1,138,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
215,65,190,80],0,1,1,139,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
215,100,180,115],0,1,1,140,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
235,140,215,150],0,1,1,141,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
220,235,205,265],0,1,1,146,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
235,255,225,275],0,1,1,147,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
195,330,220,300],0,1,1,148,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
125,255,110,280],0,1,1,149,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
text('black',610,33,1,1,1,44,15,151,12,3,0,0,0,0,2,44,15,0,0,"",0,0,0,0,45,'',[
|
||||
minilines(44,15,0,0,1,0,0,[
|
||||
mini_line(44,12,3,0,0,0,[
|
||||
str_block(0,44,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,44,12,3,0,-1,0,0,0,0,0,
|
||||
"LTPage")])
|
||||
])
|
||||
])]).
|
||||
text('black',460,108,1,1,1,65,15,152,12,3,0,0,0,0,2,65,15,0,0,"",0,0,0,0,120,'',[
|
||||
minilines(65,15,0,0,1,0,0,[
|
||||
mini_line(65,12,3,0,0,0,[
|
||||
str_block(0,65,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,65,12,3,0,-1,0,0,0,0,0,
|
||||
"LTTextBox")])
|
||||
])
|
||||
])]).
|
||||
text('black',410,178,1,1,1,66,15,154,12,3,0,0,0,0,2,66,15,0,0,"",0,0,0,0,190,'',[
|
||||
minilines(66,15,0,0,1,0,0,[
|
||||
mini_line(66,12,3,0,0,0,[
|
||||
str_block(0,66,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,66,12,3,0,-1,0,0,0,0,0,
|
||||
"LTTextLine")])
|
||||
])
|
||||
])]).
|
||||
text('black',360,248,1,1,1,42,15,157,12,3,0,0,0,0,2,42,15,0,0,"",0,0,0,0,260,'',[
|
||||
minilines(42,15,0,0,1,0,0,[
|
||||
mini_line(42,12,3,0,0,0,[
|
||||
str_block(0,42,12,3,0,0,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,42,12,3,0,0,0,0,0,0,0,
|
||||
"LTChar")])
|
||||
])
|
||||
])]).
|
||||
text('black',420,248,1,1,1,42,15,159,12,3,0,0,0,0,2,42,15,0,0,"",0,0,0,0,260,'',[
|
||||
minilines(42,15,0,0,1,0,0,[
|
||||
mini_line(42,12,3,0,0,0,[
|
||||
str_block(0,42,12,3,0,0,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,42,12,3,0,0,0,0,0,0,0,
|
||||
"LTChar")])
|
||||
])
|
||||
])]).
|
||||
text('black',480,248,1,1,1,42,15,161,12,3,0,0,0,0,2,42,15,0,0,"",0,0,0,0,260,'',[
|
||||
minilines(42,15,0,0,1,0,0,[
|
||||
mini_line(42,12,3,0,0,0,[
|
||||
str_block(0,42,12,3,0,0,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,42,12,3,0,0,0,0,0,0,0,
|
||||
"LTText")])
|
||||
])
|
||||
])]).
|
||||
text('black',460,178,1,1,1,12,15,170,12,3,0,0,0,0,2,12,15,0,0,"",0,0,0,0,190,'',[
|
||||
minilines(12,15,0,0,1,0,0,[
|
||||
mini_line(12,12,3,0,0,0,[
|
||||
str_block(0,12,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,12,12,3,0,-1,0,0,0,0,0,
|
||||
"...")])
|
||||
])
|
||||
])]).
|
||||
text('black',520,248,1,1,1,12,15,172,12,3,0,0,0,0,2,12,15,0,0,"",0,0,0,0,260,'',[
|
||||
minilines(12,15,0,0,1,0,0,[
|
||||
mini_line(12,12,3,0,0,0,[
|
||||
str_block(0,12,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,12,12,3,0,-1,0,0,0,0,0,
|
||||
"...")])
|
||||
])
|
||||
])]).
|
||||
text('black',560,108,1,1,1,51,15,174,12,3,0,0,0,0,2,51,15,0,0,"",0,0,0,0,120,'',[
|
||||
minilines(51,15,0,0,1,0,0,[
|
||||
mini_line(51,12,3,0,0,0,[
|
||||
str_block(0,51,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,51,12,3,0,-1,0,0,0,0,0,
|
||||
"LTFigure")])
|
||||
])
|
||||
])]).
|
||||
text('black',635,108,1,1,1,39,15,178,12,3,0,0,0,0,2,39,15,0,0,"",0,0,0,0,120,'',[
|
||||
minilines(39,15,0,0,1,0,0,[
|
||||
mini_line(39,12,3,0,0,0,[
|
||||
str_block(0,39,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,39,12,3,0,-1,0,0,0,0,0,
|
||||
"LTLine")])
|
||||
])
|
||||
])]).
|
||||
text('black',700,108,1,1,1,43,15,180,12,3,0,0,0,0,2,43,15,0,0,"",0,0,0,0,120,'',[
|
||||
minilines(43,15,0,0,1,0,0,[
|
||||
mini_line(43,12,3,0,0,0,[
|
||||
str_block(0,43,12,3,0,0,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,43,12,3,0,0,0,0,0,0,0,
|
||||
"LTRect")])
|
||||
])
|
||||
])]).
|
||||
text('black',580,178,1,1,1,50,15,182,12,3,0,0,0,0,2,50,15,0,0,"",0,0,0,0,190,'',[
|
||||
minilines(50,15,0,0,1,0,0,[
|
||||
mini_line(50,12,3,0,0,0,[
|
||||
str_block(0,50,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,50,12,3,0,-1,0,0,0,0,0,
|
||||
"LTCurve")])
|
||||
])
|
||||
])]).
|
||||
text('black',775,108,1,1,1,51,15,186,12,3,0,0,0,0,2,51,15,0,0,"",0,0,0,0,120,'',[
|
||||
minilines(51,15,0,0,1,0,0,[
|
||||
mini_line(51,12,3,0,0,0,[
|
||||
str_block(0,51,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,51,12,3,0,-1,0,0,0,0,0,
|
||||
"LTImage")])
|
||||
])
|
||||
])]).
|
||||
poly('black','',2,[
|
||||
475,105,590,50],0,1,1,190,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
560,110,595,50],0,1,1,191,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
635,105,600,50],0,1,1,192,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
610,50,700,100],0,1,1,193,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
765,100,630,50],0,1,1,194,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
460,125,425,175],0,1,1,196,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
560,125,570,175],0,1,1,197,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
415,195,370,245],0,1,1,198,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
415,195,420,245],0,1,1,199,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
415,195,475,245],0,1,1,200,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
470,125,485,175],0,1,1,206,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
420,195,510,220],0,1,1,207,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
565,125,635,175],0,1,1,208,0,0,0,0,0,0,0,'1',0,0,
|
||||
"0","",[
|
||||
0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[
|
||||
]).
|
||||
text('black',635,178,1,1,1,12,15,215,12,3,0,0,0,0,2,12,15,0,0,"",0,0,0,0,190,'',[
|
||||
minilines(12,15,0,0,1,0,0,[
|
||||
mini_line(12,12,3,0,0,0,[
|
||||
str_block(0,12,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,12,12,3,0,-1,0,0,0,0,0,
|
||||
"...")])
|
||||
])
|
||||
])]).
|
|
@ -0,0 +1,35 @@
|
|||
@ECHO OFF
|
||||
|
||||
pushd %~dp0
|
||||
|
||||
REM Command file for Sphinx documentation
|
||||
|
||||
if "%SPHINXBUILD%" == "" (
|
||||
set SPHINXBUILD=sphinx-build
|
||||
)
|
||||
set SOURCEDIR=source
|
||||
set BUILDDIR=build
|
||||
|
||||
if "%1" == "" goto help
|
||||
|
||||
%SPHINXBUILD% >NUL 2>NUL
|
||||
if errorlevel 9009 (
|
||||
echo.
|
||||
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
|
||||
echo.installed, then set the SPHINXBUILD environment variable to point
|
||||
echo.to the full path of the 'sphinx-build' executable. Alternatively you
|
||||
echo.may add the Sphinx directory to PATH.
|
||||
echo.
|
||||
echo.If you don't have Sphinx installed, grab it from
|
||||
echo.http://sphinx-doc.org/
|
||||
exit /b 1
|
||||
)
|
||||
|
||||
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
|
||||
goto end
|
||||
|
||||
:help
|
||||
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
|
||||
|
||||
:end
|
||||
popd
|
187
docs/objrel.obj
|
@ -1,187 +0,0 @@
|
|||
%TGIF 4.2.2
|
||||
state(0,37,100.000,0,0,0,16,1,9,1,1,1,0,0,2,1,1,'Helvetica-Bold',1,69120,0,0,1,10,0,0,1,1,0,16,0,0,1,1,1,1,1050,1485,1,0,2880,0).
|
||||
%
|
||||
% @(#)$Header$
|
||||
% %W%
|
||||
%
|
||||
unit("1 pixel/pixel").
|
||||
color_info(19,65535,0,[
|
||||
"magenta", 65535, 0, 65535, 65535, 0, 65535, 1,
|
||||
"red", 65535, 0, 0, 65535, 0, 0, 1,
|
||||
"green", 0, 65535, 0, 0, 65535, 0, 1,
|
||||
"blue", 0, 0, 65535, 0, 0, 65535, 1,
|
||||
"yellow", 65535, 65535, 0, 65535, 65535, 0, 1,
|
||||
"pink", 65535, 49344, 52171, 65535, 49344, 52171, 1,
|
||||
"cyan", 0, 65535, 65535, 0, 65535, 65535, 1,
|
||||
"CadetBlue", 24415, 40606, 41120, 24415, 40606, 41120, 1,
|
||||
"white", 65535, 65535, 65535, 65535, 65535, 65535, 1,
|
||||
"black", 0, 0, 0, 0, 0, 0, 1,
|
||||
"DarkSlateGray", 12079, 20303, 20303, 12079, 20303, 20303, 1,
|
||||
"#00000000c000", 0, 0, 49344, 0, 0, 49152, 1,
|
||||
"#820782070000", 33410, 33410, 0, 33287, 33287, 0, 1,
|
||||
"#3cf3fbee34d2", 15420, 64507, 13364, 15603, 64494, 13522, 1,
|
||||
"#3cf3fbed34d3", 15420, 64507, 13364, 15603, 64493, 13523, 1,
|
||||
"#ffffa6990000", 65535, 42662, 0, 65535, 42649, 0, 1,
|
||||
"#ffff0000fffe", 65535, 0, 65535, 65535, 0, 65534, 1,
|
||||
"#fffe0000fffe", 65535, 0, 65535, 65534, 0, 65534, 1,
|
||||
"#fffe00000000", 65535, 0, 0, 65534, 0, 0, 1
|
||||
]).
|
||||
script_frac("0.6").
|
||||
fg_bg_colors('black','white').
|
||||
dont_reencode("FFDingbests:ZapfDingbats").
|
||||
objshadow_info('#c0c0c0',2,2).
|
||||
rotate_pivot(0,0,0,0).
|
||||
spline_tightness(1).
|
||||
page(1,"",1,'').
|
||||
oval('black','',350,380,450,430,2,2,1,88,0,0,0,0,0,'2',0,[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
270,270,350,230],1,2,1,54,0,0,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
270,280,350,320],1,2,1,55,0,0,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
]).
|
||||
box('black','',350,100,450,150,2,2,1,2,0,0,0,0,0,'2',0,[
|
||||
]).
|
||||
text('black',400,118,1,1,1,84,15,3,12,3,0,0,0,0,2,84,15,0,0,"",0,0,0,0,130,'',[
|
||||
minilines(84,15,0,0,1,0,0,[
|
||||
mini_line(84,12,3,0,0,0,[
|
||||
str_block(0,84,12,3,0,0,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,84,12,3,0,0,0,0,0,0,0,
|
||||
"PDFDocument")])
|
||||
])
|
||||
])]).
|
||||
box('black','',150,100,250,150,2,2,1,13,0,0,0,0,0,'2',0,[
|
||||
]).
|
||||
text('black',200,118,1,1,1,63,15,14,12,3,0,0,0,0,2,63,15,0,0,"",0,0,0,0,130,'',[
|
||||
minilines(63,15,0,0,1,0,0,[
|
||||
mini_line(63,12,3,0,0,0,[
|
||||
str_block(0,63,12,3,0,0,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,63,12,3,0,0,0,0,0,0,0,
|
||||
"PDFParser")])
|
||||
])
|
||||
])]).
|
||||
box('black','',350,200,450,250,2,2,1,20,0,0,0,0,0,'2',0,[
|
||||
]).
|
||||
text('black',400,218,1,1,1,88,15,21,12,3,0,0,0,0,2,88,15,0,0,"",0,0,0,0,230,'',[
|
||||
minilines(88,15,0,0,1,0,0,[
|
||||
mini_line(88,12,3,0,0,0,[
|
||||
str_block(0,88,12,3,0,0,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,88,12,3,0,0,0,0,0,0,0,
|
||||
"PDFInterpreter")])
|
||||
])
|
||||
])]).
|
||||
box('black','',350,300,450,350,2,2,1,23,0,0,0,0,0,'2',0,[
|
||||
]).
|
||||
text('black',400,318,1,1,1,65,15,24,12,3,0,0,0,0,2,65,15,0,0,"",0,0,0,0,330,'',[
|
||||
minilines(65,15,0,0,1,0,0,[
|
||||
mini_line(65,12,3,0,0,0,[
|
||||
str_block(0,65,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,65,12,3,0,-1,0,0,0,0,0,
|
||||
"PDFDevice")])
|
||||
])
|
||||
])]).
|
||||
box('black','',180,250,280,300,2,2,1,29,0,0,0,0,0,'2',0,[
|
||||
]).
|
||||
text('black',230,268,1,1,1,131,15,30,12,3,2,0,0,0,2,131,15,0,0,"",0,0,0,0,280,'',[
|
||||
minilines(131,15,0,0,1,0,0,[
|
||||
mini_line(131,12,3,0,0,0,[
|
||||
str_block(0,131,12,3,0,0,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,131,12,3,0,0,0,0,0,0,0,
|
||||
"PDFResourceManager")])
|
||||
])
|
||||
])]).
|
||||
poly('black','',2,[
|
||||
250,140,350,140],1,2,1,45,0,0,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
350,110,250,110],1,2,1,46,0,0,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
400,150,400,200],1,2,1,47,0,0,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
400,250,400,300],1,2,1,56,0,0,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
]).
|
||||
poly('black','',2,[
|
||||
400,350,400,380],0,2,1,65,0,0,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
]).
|
||||
text('black',400,388,3,1,1,44,41,71,12,3,0,-2,0,0,2,44,41,0,0,"",0,0,0,0,400,'',[
|
||||
minilines(44,41,0,0,1,-2,0,[
|
||||
mini_line(44,12,3,0,0,0,[
|
||||
str_block(0,44,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,44,12,3,0,-1,0,0,0,0,0,
|
||||
"Display")])
|
||||
]),
|
||||
mini_line(20,12,3,0,0,0,[
|
||||
str_block(0,20,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,20,12,3,0,-1,0,0,0,0,0,
|
||||
"File")])
|
||||
]),
|
||||
mini_line(23,12,3,0,0,0,[
|
||||
str_block(0,23,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,23,12,3,0,-1,0,0,0,0,0,
|
||||
"etc.")])
|
||||
])
|
||||
])]).
|
||||
text('black',300,88,1,1,1,92,15,79,12,3,0,0,0,0,2,92,15,0,0,"",0,0,0,0,100,'',[
|
||||
minilines(92,15,0,0,1,0,0,[
|
||||
mini_line(92,12,3,0,0,0,[
|
||||
str_block(0,92,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,92,12,3,0,-1,0,0,0,0,0,
|
||||
"request objects")])
|
||||
])
|
||||
])]).
|
||||
text('black',300,148,1,1,1,78,15,84,12,3,0,0,0,0,2,78,15,0,0,"",0,0,0,0,160,'',[
|
||||
minilines(78,15,0,0,1,0,0,[
|
||||
mini_line(78,12,3,0,0,0,[
|
||||
str_block(0,78,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,78,12,3,0,-1,0,0,0,0,0,
|
||||
"store objects")])
|
||||
])
|
||||
])]).
|
||||
oval('black','',20,100,120,150,2,2,1,106,0,0,0,0,0,'2',0,[
|
||||
]).
|
||||
text('black',70,118,1,1,1,46,15,107,12,3,0,0,0,0,2,46,15,0,0,"",0,0,0,0,130,'',[
|
||||
minilines(46,15,0,0,1,0,0,[
|
||||
mini_line(46,12,3,0,0,0,[
|
||||
str_block(0,46,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,46,12,3,0,-1,0,0,0,0,0,
|
||||
"PDF file")])
|
||||
])
|
||||
])]).
|
||||
poly('black','',2,[
|
||||
120,120,150,120],0,2,1,114,0,2,0,0,0,0,0,'2',0,0,
|
||||
"0","",[
|
||||
0,10,4,0,'10','4','0'],[0,10,4,0,'10','4','0'],[
|
||||
]).
|
||||
text('black',400,158,1,1,1,84,15,115,12,3,2,0,0,0,2,84,15,0,0,"",0,0,0,0,170,'',[
|
||||
minilines(84,15,0,0,1,0,0,[
|
||||
mini_line(84,12,3,0,0,0,[
|
||||
str_block(0,84,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,84,12,3,0,-1,0,0,0,0,0,
|
||||
"page contents")])
|
||||
])
|
||||
])]).
|
||||
text('black',400,258,1,1,1,129,15,119,12,3,2,0,0,0,2,129,15,0,0,"",0,0,0,0,270,'',[
|
||||
minilines(129,15,0,0,1,0,0,[
|
||||
mini_line(129,12,3,0,0,0,[
|
||||
str_block(0,129,12,3,0,-1,0,0,0,[
|
||||
str_seg('black','Helvetica-Bold',1,69120,129,12,3,0,-1,0,0,0,0,0,
|
||||
"rendering instructions")])
|
||||
])
|
||||
])]).
|
BIN
docs/objrel.png
Binary file not shown.
Before Width: | Height: | Size: 2.0 KiB |
|
@ -1,223 +0,0 @@
|
|||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
|
||||
<html>
|
||||
<head>
|
||||
<link rel="stylesheet" type="text/css" href="style.css">
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
|
||||
<title>Programming with PDFMiner</title>
|
||||
</head>
|
||||
<body>
|
||||
|
||||
<div align=right class=lastmod>
|
||||
<!-- hhmts start -->
|
||||
Last Modified: Mon Mar 24 11:49:28 UTC 2014
|
||||
<!-- hhmts end -->
|
||||
</div>
|
||||
|
||||
<p>
|
||||
<a href="index.html">[Back to PDFMiner homepage]</a>
|
||||
|
||||
<h1>Programming with PDFMiner</h1>
|
||||
<p>
|
||||
This page explains how to use PDFMiner as a library
|
||||
from other applications.
|
||||
<ul>
|
||||
<li> <a href="#overview">Overview</a>
|
||||
<li> <a href="#basic">Basic Usage</a>
|
||||
<li> <a href="#layout">Performing Layout Analysis</a>
|
||||
<li> <a href="#tocextract">Obtaining Table of Contents</a>
|
||||
<li> <a href="#extend">Extending Functionality</a>
|
||||
</ul>
|
||||
|
||||
<h2><a name="overview">Overview</a></h2>
|
||||
<p>
|
||||
<strong>PDF is evil.</strong> Although it is called a PDF
"document", it's nothing like a Word or HTML document. PDF is more
|
||||
like a graphic representation. PDF contents are just a bunch of
|
||||
instructions that tell how to place the stuff at each exact
|
||||
position on a display or paper. In most cases, it has no logical
|
||||
structure such as sentences or paragraphs and it cannot adapt
|
||||
itself when the paper size changes. PDFMiner attempts to
|
||||
reconstruct some of those structures by guessing from its
|
||||
positioning, but there's nothing guaranteed to work. Ugly, I
|
||||
know. Again, PDF is evil.
|
||||
|
||||
<p>
|
||||
[More technical details about the internal structure of PDF:
|
||||
"How to Extract Text Contents from PDF Manually"
|
||||
<a href="http://www.youtube.com/watch?v=k34wRxaxA_c">(part 1)</a>
|
||||
<a href="http://www.youtube.com/watch?v=_A1M4OdNsiQ">(part 2)</a>
|
||||
<a href="http://www.youtube.com/watch?v=sfV_7cWPgZE">(part 3)</a>]
|
||||
|
||||
<p>
|
||||
Because a PDF file has such a big and complex structure,
|
||||
parsing a PDF file as a whole is time- and memory-consuming. However,
|
||||
not every part is needed for most PDF processing tasks. Therefore
|
||||
PDFMiner takes a strategy of lazy parsing, which is to parse the
|
||||
stuff only when it's necessary. To parse PDF files, you need to use at
|
||||
least two classes: <code>PDFParser</code> and <code>PDFDocument</code>.
|
||||
These two objects are associated with each other.
|
||||
<code>PDFParser</code> fetches data from a file,
|
||||
and <code>PDFDocument</code> stores it. You'll also need
|
||||
<code>PDFPageInterpreter</code> to process the page contents
|
||||
and <code>PDFDevice</code> to translate it to whatever you need.
|
||||
<code>PDFResourceManager</code> is used to store
|
||||
shared resources such as fonts or images.
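<p>
The lazy parsing strategy described above can be sketched without PDFMiner itself. The class below is a hypothetical stand-in, not part of the PDFMiner API: <code>LazyObjectStore</code>, its <code>parse</code> callback and the <code>caching</code> flag are assumptions made for illustration. It parses an object only on first request and, optionally, keeps the result for later requests.

```python
from typing import Callable, Dict


class LazyObjectStore:
    """Hypothetical store that parses an object only on first request."""

    def __init__(self, parse: Callable[[int], str], caching: bool = True):
        self._parse = parse
        self._caching = caching
        self._cache: Dict[int, str] = {}
        self.parse_calls = 0  # counts how often the parser really ran

    def get(self, objid: int) -> str:
        if objid in self._cache:
            return self._cache[objid]
        self.parse_calls += 1
        obj = self._parse(objid)
        if self._caching:
            self._cache[objid] = obj
        return obj


def fake_parse(objid: int) -> str:
    # Stand-in for the expensive work of reading an object from the file.
    return "object %d" % objid


store = LazyObjectStore(fake_parse)
store.get(6)
store.get(6)
print(store.parse_calls)
```

<p>
Nothing is parsed until <code>get()</code> is called. Turning <code>caching</code> off trades memory for repeated parsing.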
|
||||
|
||||
<p>
|
||||
Figure 1 shows the relationship between the classes in PDFMiner.
|
||||
|
||||
<div align=center>
|
||||
<img src="objrel.png"><br>
|
||||
<small>Figure 1. Relationships between PDFMiner classes</small>
|
||||
</div>
|
||||
|
||||
<h2><a name="basic">Basic Usage</a></h2>
|
||||
<p>
|
||||
A typical way to parse a PDF file is the following:
|
||||
<blockquote><pre>
|
||||
from pdfminer.pdfparser import PDFParser
|
||||
from pdfminer.pdfdocument import PDFDocument
|
||||
from pdfminer.pdfpage import PDFPage
|
||||
from pdfminer.pdfpage import PDFTextExtractionNotAllowed
|
||||
from pdfminer.pdfinterp import PDFResourceManager
|
||||
from pdfminer.pdfinterp import PDFPageInterpreter
|
||||
from pdfminer.pdfdevice import PDFDevice
|
||||
|
||||
<span class="comment"># Open a PDF file.</span>
|
||||
fp = open('mypdf.pdf', 'rb')
|
||||
<span class="comment"># Create a PDF parser object associated with the file object.</span>
|
||||
parser = PDFParser(fp)
|
||||
<span class="comment"># Create a PDF document object that stores the document structure.</span>
|
||||
<span class="comment"># Supply the password for initialization.</span>
|
||||
document = PDFDocument(parser, password)
|
||||
<span class="comment"># Check if the document allows text extraction. If not, abort.</span>
|
||||
if not document.is_extractable:
|
||||
    raise PDFTextExtractionNotAllowed
|
||||
<span class="comment"># Create a PDF resource manager object that stores shared resources.</span>
|
||||
rsrcmgr = PDFResourceManager()
|
||||
<span class="comment"># Create a PDF device object.</span>
|
||||
device = PDFDevice(rsrcmgr)
|
||||
<span class="comment"># Create a PDF interpreter object.</span>
|
||||
interpreter = PDFPageInterpreter(rsrcmgr, device)
|
||||
<span class="comment"># Process each page contained in the document.</span>
|
||||
for page in PDFPage.create_pages(document):
|
||||
    interpreter.process_page(page)
|
||||
</pre></blockquote>
|
||||
|
||||
<h2><a name="layout">Performing Layout Analysis</a></h2>
|
||||
<p>
|
||||
Here is a typical way to use the layout analysis function:
|
||||
<blockquote><pre>
|
||||
from pdfminer.layout import LAParams
|
||||
from pdfminer.converter import PDFPageAggregator
|
||||
|
||||
<span class="comment"># Set parameters for analysis.</span>
|
||||
laparams = LAParams()
|
||||
<span class="comment"># Create a PDF page aggregator object.</span>
|
||||
device = PDFPageAggregator(rsrcmgr, laparams=laparams)
|
||||
interpreter = PDFPageInterpreter(rsrcmgr, device)
|
||||
for page in PDFPage.create_pages(document):
|
||||
    interpreter.process_page(page)
|
||||
    <span class="comment"># Receive the LTPage object for the page.</span>
|
||||
    layout = device.get_result()
|
||||
</pre></blockquote>
|
||||
|
||||
A layout analyzer returns a <code>LTPage</code> object for each page
|
||||
in the PDF document. This object contains child objects within the page,
|
||||
forming a tree structure. Figure 2 shows the relationship between
|
||||
these objects.
|
||||
|
||||
<div align=center>
|
||||
<img src="layout.png"><br>
|
||||
<small>Figure 2. Layout objects and their tree structure</small>
|
||||
</div>
|
||||
|
||||
<dl>
|
||||
<dt> <code>LTPage</code>
|
||||
<dd> Represents an entire page. May contain child objects like
|
||||
<code>LTTextBox</code>, <code>LTFigure</code>, <code>LTImage</code>, <code>LTRect</code>,
|
||||
<code>LTCurve</code> and <code>LTLine</code>.
|
||||
|
||||
<dt> <code>LTTextBox</code>
|
||||
<dd> Represents a group of text chunks that can be contained in a rectangular area.
|
||||
Note that this box is created by geometric analysis and does not necessarily
|
||||
represent a logical boundary of the text.
|
||||
It contains a list of <code>LTTextLine</code> objects.
|
||||
<code>get_text()</code> method returns the text content.
|
||||
|
||||
<dt> <code>LTTextLine</code>
|
||||
<dd> Contains a list of <code>LTChar</code> objects that represent
|
||||
a single text line. The characters are aligned either horizontally
|
||||
or vertically, depending on the text's writing mode.
|
||||
<code>get_text()</code> method returns the text content.
|
||||
|
||||
<dt> <code>LTChar</code>
|
||||
<dt> <code>LTAnno</code>
|
||||
<dd> Represent an actual letter in the text as a Unicode string.
|
||||
Note that, while a <code>LTChar</code> object has actual boundaries,
|
||||
<code>LTAnno</code> objects do not, as these are "virtual" characters,
|
||||
inserted by a layout analyzer according to the relationship between two characters
|
||||
(e.g. a space).
|
||||
|
||||
<dt> <code>LTFigure</code>
|
||||
<dd> Represents an area used by PDF Form objects. PDF Forms can be used to
|
||||
present figures or pictures by embedding yet another PDF document within a page.
|
||||
Note that <code>LTFigure</code> objects can appear recursively.
|
||||
|
||||
<dt> <code>LTImage</code>
|
||||
<dd> Represents an image object. Embedded images can be
|
||||
in JPEG or other formats, but currently PDFMiner does not
|
||||
pay much attention to graphical objects.
|
||||
|
||||
<dt> <code>LTLine</code>
|
||||
<dd> Represents a single straight line.
|
||||
Could be used for separating text or figures.
|
||||
|
||||
<dt> <code>LTRect</code>
|
||||
<dd> Represents a rectangle.
|
||||
Could be used for framing other pictures or figures.
|
||||
|
||||
<dt> <code>LTCurve</code>
|
||||
<dd> Represents a generic Bezier curve.
|
||||
</dl>
|
||||
|
||||
<p>
|
||||
Also, check out <a href="http://denis.papathanasiou.org/archive/2010.08.04.post.pdf">a more complete example by Denis Papathanasiou (Extracting Text &amp; Images from PDF Files)</a>.
|
||||
|
||||
<h2><a name="tocextract">Obtaining Table of Contents</a></h2>
|
||||
<p>
|
||||
PDFMiner provides functions to access the document's table of contents
|
||||
("Outlines").
|
||||
|
||||
<blockquote><pre>
|
||||
from pdfminer.pdfparser import PDFParser
|
||||
from pdfminer.pdfdocument import PDFDocument
|
||||
|
||||
<span class="comment"># Open a PDF document.</span>
|
||||
fp = open('mypdf.pdf', 'rb')
|
||||
parser = PDFParser(fp)
|
||||
document = PDFDocument(parser, password)
|
||||
|
||||
<span class="comment"># Get the outlines of the document.</span>
|
||||
outlines = document.get_outlines()
|
||||
for (level,title,dest,a,se) in outlines:
|
||||
    print(level, title)
|
||||
</pre></blockquote>
|
||||
|
||||
<p>
|
||||
Some PDF documents use page numbers as destinations, while others
|
||||
use page numbers and the physical location within the page. Since
|
||||
PDF does not have a logical structure, and it does not provide a
|
||||
way to refer to any in-page object from the outside, there's no
|
||||
way to tell exactly which part of text these destinations are
|
||||
referring to.
|
||||
|
||||
<h2><a name="extend">Extending Functionality</a></h2>
|
||||
|
||||
<p>
|
||||
You can extend the <code>PDFPageInterpreter</code> and <code>PDFDevice</code> classes
|
||||
in order to process pages differently or obtain other information.
|
||||
|
||||
<hr noshade>
|
||||
<address>Yusuke Shinyama</address>
|
||||
</body>
|
|
@ -0,0 +1 @@
|
|||
sphinx-argparse
|
|
@ -0,0 +1,28 @@
|
|||
<style>
|
||||
td {
|
||||
text-align: center;
|
||||
}
|
||||
</style>
|
||||
<table style="margin: 10px; padding: 10px;">
|
||||
<tr>
|
||||
<td style="text-align: right; border-right:1px red solid">→</td>
|
||||
<td colspan="4"
|
||||
style="text-align: left; border-left:1px red solid">← <em><font
|
||||
color="red">M</font></em></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td style="border:1px solid"><code>Q u i</code></td>
|
||||
<td style="border:1px solid"><code>c k</code></td>
|
||||
<td width="10px"></td>
|
||||
<td style="border:1px solid"><code>b r o w n</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td colspan="2" style="text-align: right; border-right:1px green solid">
|
||||
→
|
||||
</td>
|
||||
<td></td>
|
||||
<td colspan="2"
|
||||
style="text-align: left; border-left:1px green solid">←
|
||||
<em><font color="green">W</font></em></td>
|
||||
</tr>
|
||||
</table>
|
|
@ -0,0 +1,23 @@
|
|||
<style>
|
||||
.background-blue {
|
||||
background-color: lightblue;
|
||||
border: 2px solid lightblue;
|
||||
}
|
||||
</style>
|
||||
<table style="margin: 10px; padding: 10px;">
|
||||
<tr>
|
||||
<td style="border:1px solid; text-align: left">
|
||||
<code>
|
||||
Q u i c k b r o w n<br/> f o x
|
||||
</code>
|
||||
</td>
|
||||
<td class="background-blue" colspan="3"></td>
|
||||
</tr>
|
||||
<tr style="height: 10px;">
|
||||
<td class="background-blue" colspan="4"></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="background-blue" colspan="3"></td>
|
||||
<td style="border:1px solid"><code>j u m p s ...</code></td>
|
||||
</tr>
|
||||
</table>
|
|
@ -0,0 +1,45 @@
|
|||
<style>
|
||||
td {
|
||||
text-align: center;
|
||||
}
|
||||
</style>
|
||||
<table style="margin: 10px; padding: 10px;">
|
||||
<tr>
|
||||
<td></td>
|
||||
<td></td>
|
||||
<td align=right style="border-bottom:1px blue solid">↓</td>
|
||||
<td></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td colspan="2" style="border:1px solid"><code>Q u i c k b r o w
|
||||
n</code></td>
|
||||
<td></td>
|
||||
<td align=right style="border-bottom:1px blue solid">↓</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td>
|
||||
<td></td>
|
||||
<td align=center valign=center><em><font color="blue">
|
||||
L<sub>1</sub>
|
||||
</font></em></td>
|
||||
<td></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td style="border:1px solid;">
|
||||
<code>f o x</code>
|
||||
</td>
|
||||
<td>
|
||||
|
||||
</td>
|
||||
<td align=right style="border-top:1px blue solid">↑</td>
|
||||
<td align=center valign=center><em><font color="blue">
|
||||
L<sub>2</sub>
|
||||
</font></em></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td>
|
||||
<td></td>
|
||||
<td></td>
|
||||
<td align=right style="border-top:1px blue solid">↑</td>
|
||||
</tr>
|
||||
</table>
|
Before Width: | Height: | Size: 3.5 KiB After Width: | Height: | Size: 3.5 KiB |
|
@ -0,0 +1,25 @@
|
|||
.. _api_commandline:
|
||||
|
||||
|
||||
Command-line API
|
||||
****************
|
||||
|
||||
.. _api_pdf2txt:
|
||||
|
||||
pdf2txt.py
|
||||
==========
|
||||
|
||||
.. argparse::
|
||||
:module: tools.pdf2txt
|
||||
:func: maketheparser
|
||||
:prog: python tools/pdf2txt.py
|
||||
|
||||
.. _api_dumppdf:
|
||||
|
||||
dumppdf.py
|
||||
==========
|
||||
|
||||
.. argparse::
|
||||
:module: tools.dumppdf
|
||||
:func: create_parser
|
||||
:prog: python tools/dumppdf.py
|
|
@ -0,0 +1,20 @@
|
|||
.. _api_composable:
|
||||
|
||||
Composable API
|
||||
**************
|
||||
|
||||
.. _api_laparams:
|
||||
|
||||
LAParams
|
||||
========
|
||||
|
||||
.. currentmodule:: pdfminer.layout
|
||||
.. autoclass:: LAParams
|
||||
|
||||
Todo:
|
||||
=====
|
||||
|
||||
- `PDFDevice`
|
||||
- `TextConverter`
|
||||
- `PDFPageAggregator`
|
||||
- `PDFPageInterpreter`
|
|
@ -0,0 +1,21 @@
|
|||
.. _api_highlevel:
|
||||
|
||||
High-level functions API
|
||||
************************
|
||||
|
||||
.. _api_extract_text:
|
||||
|
||||
extract_text
|
||||
============
|
||||
|
||||
.. currentmodule:: pdfminer.high_level
|
||||
.. autofunction:: extract_text
|
||||
|
||||
|
||||
.. _api_extract_text_to_fp:
|
||||
|
||||
extract_text_to_fp
|
||||
==================
|
||||
|
||||
.. currentmodule:: pdfminer.high_level
|
||||
.. autofunction:: extract_text_to_fp
|
|
@ -0,0 +1,9 @@
|
|||
API documentation
|
||||
*****************
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
commandline
|
||||
highlevel
|
||||
composable
|
|
@ -0,0 +1,61 @@
|
|||
# Configuration file for the Sphinx documentation builder.
|
||||
#
|
||||
# This file only contains a selection of the most common options. For a full
|
||||
# list see the documentation:
|
||||
# https://www.sphinx-doc.org/en/master/usage/configuration.html
|
||||
|
||||
# -- Path setup --------------------------------------------------------------
|
||||
|
||||
# If extensions (or modules to document with autodoc) are in another directory,
|
||||
# add these directories to sys.path here. If the directory is relative to the
|
||||
# documentation root, use os.path.abspath to make it absolute, like shown here.
|
||||
|
||||
import os
|
||||
import sys
|
||||
sys.path.insert(0, os.path.join(os.path.abspath(os.path.dirname(__file__)), '../../'))
|
||||
|
||||
|
||||
# -- Project information -----------------------------------------------------
|
||||
|
||||
project = 'pdfminer.six'
|
||||
copyright = '2019, Yusuke Shinyama, Philippe Guglielmetti & Pieter Marsman'
|
||||
author = 'Yusuke Shinyama, Philippe Guglielmetti & Pieter Marsman'
|
||||
|
||||
# The full version, including alpha/beta/rc tags
|
||||
release = '20191020'
|
||||
|
||||
|
||||
# -- General configuration ---------------------------------------------------
|
||||
|
||||
# Add any Sphinx extension module names here, as strings. They can be
|
||||
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
|
||||
# ones.
|
||||
extensions = [
|
||||
'sphinxarg.ext',
|
||||
'sphinx.ext.autodoc',
|
||||
'sphinx.ext.doctest',
|
||||
]
|
||||
|
||||
# Root rst file
|
||||
master_doc = 'index'
|
||||
|
||||
# Add any paths that contain templates here, relative to this directory.
|
||||
templates_path = ['_templates']
|
||||
|
||||
# List of patterns, relative to source directory, that match files and
|
||||
# directories to ignore when looking for source files.
|
||||
# This pattern also affects html_static_path and html_extra_path.
|
||||
exclude_patterns = []
|
||||
|
||||
|
||||
# -- Options for HTML output -------------------------------------------------
|
||||
|
||||
# The theme to use for HTML and HTML Help pages. See the documentation for
|
||||
# a list of builtin themes.
|
||||
#
|
||||
html_theme = 'alabaster'
|
||||
|
||||
# Add any paths that contain custom static files (such as style sheets) here,
|
||||
# relative to this directory. They are copied after the builtin static files,
|
||||
# so a file named "default.css" will overwrite the builtin "default.css".
|
||||
html_static_path = ['_static']
|
|
@ -0,0 +1,72 @@
|
|||
Welcome to pdfminer.six's documentation!
|
||||
****************************************
|
||||
|
||||
.. image:: https://travis-ci.org/pdfminer/pdfminer.six.svg?branch=master
|
||||
:target: https://travis-ci.org/pdfminer/pdfminer.six
|
||||
:alt: Travis-ci build badge
|
||||
|
||||
.. image:: https://img.shields.io/pypi/v/pdfminer.six.svg
|
||||
:target: https://pypi.python.org/pypi/pdfminer.six/
|
||||
:alt: PyPi version badge
|
||||
|
||||
.. image:: https://badges.gitter.im/pdfminer-six/Lobby.svg
|
||||
:target: https://gitter.im/pdfminer-six/Lobby?utm_source=badge&utm_medium
|
||||
:alt: gitter badge
|
||||
|
||||
|
||||
Pdfminer.six is a Python package for extracting information from PDF documents.
|
||||
|
||||
Check out the source on `github <https://github.com/pdfminer/pdfminer.six>`_.
|
||||
|
||||
Content
|
||||
=======
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
tutorials/index
|
||||
topics/index
|
||||
api/index
|
||||
|
||||
|
||||
Features
|
||||
========
|
||||
|
||||
* Parse all objects from a PDF document into Python objects.
|
||||
* Analyze and group text in a human-readable way.
|
||||
* Extract text, images (JPG, JBIG2 and Bitmaps), table-of-contents, tagged
|
||||
contents and more.
|
||||
* Support for (almost all) features from the PDF-1.7 specification
|
||||
* Support for Chinese, Japanese and Korean (CJK) languages as well as vertical
|
||||
writing.
|
||||
* Support for various font types (Type1, TrueType, Type3, and CID).
|
||||
* Support for basic encryption (RC4).
|
||||
|
||||
|
||||
Installation instructions
|
||||
=========================
|
||||
|
||||
Before using it, you must install it using Python 2.7 or newer.
|
||||
|
||||
::
|
||||
|
||||
$ pip install pdfminer.six
|
||||
|
||||
Note that Python 2.7 support will be dropped in January 2020.
|
||||
|
||||
Common use-cases
|
||||
----------------
|
||||
|
||||
* :ref:`tutorial_commandline` if you just want to extract text from a pdf once.
|
||||
* :ref:`tutorial_highlevel` if you want to integrate pdfminer.six with your
|
||||
Python code.
|
||||
* :ref:`tutorial_composable` when you want to tailor the behavior of
|
||||
pdfminer.six to your needs.
|
||||
|
||||
|
||||
Contributing
|
||||
============
|
||||
|
||||
We welcome any contributors to pdfminer.six! But, before doing anything, take
|
||||
a look at the `contribution guide
|
||||
<https://github.com/pdfminer/pdfminer.six/blob/master/CONTRIBUTING.md>`_.
|
|
@ -0,0 +1,132 @@
|
|||
.. _topic_pdf_to_text:
|
||||
|
||||
Converting a PDF file to text
|
||||
*****************************
|
||||
|
||||
Most PDF files look like they contain well structured text. But the reality is
|
||||
that a PDF file does not contain anything that resembles paragraphs,
|
||||
sentences or even words. When it comes to text, a PDF file is only aware of
|
||||
the characters and their placement.
|
||||
|
||||
This makes extracting meaningful pieces of text from PDF files difficult.
|
||||
The characters that compose a paragraph are no different from those that
|
||||
compose the table, the page footer or the description of a figure. Unlike
|
||||
other document formats, like a `.txt` file or a Word document, the PDF format
|
||||
does not contain a stream of text.
|
||||
|
||||
A PDF document consists of a collection of objects that together describe
|
||||
the appearance of one or more pages, possibly accompanied by additional
|
||||
interactive elements and higher-level application data. A PDF file contains
|
||||
the objects making up a PDF document along with associated structural
|
||||
information, all represented as a single self-contained sequence of bytes. [1]_
|
||||
|
||||
Layout analysis algorithm
|
||||
=========================
|
||||
|
||||
PDFMiner attempts to reconstruct some of those structures by using heuristics
|
||||
on the positioning of characters. This works well for sentences and
|
||||
paragraphs because meaningful groups of nearby characters can be made.
|
||||
|
||||
The layout analysis consists of three different stages: it groups characters
|
||||
into words and lines, then it groups lines into boxes and finally it groups
|
||||
textboxes hierarchically. These stages are discussed in the following
|
||||
sections. The resulting output of the layout analysis is an ordered hierarchy
|
||||
of layout objects on a PDF page.
|
||||
|
||||
.. figure:: ../_static/layout_analysis_output.png
|
||||
:align: center
|
||||
|
||||
The output of the layout analysis is a hierarchy of layout objects.
|
||||
|
||||
|
||||
The output of the layout analysis heavily depends on a couple of parameters.
|
||||
All these parameters are part of the :ref:`api_laparams` class.
|
||||
|
||||
Grouping characters into words and lines
|
||||
----------------------------------------
|
||||
|
||||
The first step in going from characters to text is to group characters in a
|
||||
meaningful way. Each character has an x-coordinate and a y-coordinate for its
|
||||
bottom-left corner and upper-right corner, i.e. its bounding box.
|
||||
pdfminer.six uses these bounding boxes to decide which characters belong together.
|
||||
|
||||
Characters that are both horizontally and vertically close are grouped. How
|
||||
close they should be is determined by the `char_margin` (M in figure) and the
|
||||
`line_overlap` (not in figure) parameter. The horizontal *distance* between the
|
||||
bounding boxes of two characters should be smaller than the `char_margin` and
|
||||
the vertical *overlap* between the bounding boxes should be larger than the
|
||||
`line_overlap`.
|
||||
|
||||
|
||||
.. raw:: html
|
||||
:file: ../_static/layout_analysis.html
|
||||
|
||||
The values of `char_margin` and `line_overlap` are relative to the size of
|
||||
the bounding boxes of the characters. The `char_margin` is relative to the
|
||||
maximum width of either one of the bounding boxes, and the `line_overlap` is
|
||||
relative to the minimum height of either one of the bounding boxes.
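The relative margins described above can be sketched in plain Python. This is a minimal illustration rather than pdfminer.six's actual implementation: the `Box` tuple and the `same_line` helper are hypothetical names, the default values are assumptions chosen for the example, and the sketch assumes that `line_overlap` acts as the minimum vertical overlap two characters need in order to share a line.

```python
from collections import namedtuple

# Hypothetical bounding box: (x0, y0) is the bottom-left corner,
# (x1, y1) is the upper-right corner.
Box = namedtuple("Box", "x0 y0 x1 y1")


def horizontal_distance(a, b):
    """Horizontal gap between two boxes (0.0 if they overlap on the x-axis)."""
    return max(0.0, max(a.x0, b.x0) - min(a.x1, b.x1))


def vertical_overlap(a, b):
    """Length of the shared y-range of two boxes (0.0 if there is none)."""
    return max(0.0, min(a.y1, b.y1) - max(a.y0, b.y0))


def same_line(a, b, char_margin=2.0, line_overlap=0.5):
    """Decide whether two character boxes belong to the same line.

    char_margin is scaled by the larger width of the two boxes,
    line_overlap by the smaller height (assumed default values).
    """
    max_width = max(a.x1 - a.x0, b.x1 - b.x0)
    min_height = min(a.y1 - a.y0, b.y1 - b.y0)
    close = horizontal_distance(a, b) < char_margin * max_width
    overlapping = vertical_overlap(a, b) > line_overlap * min_height
    return close and overlapping


q = Box(0, 0, 10, 12)
u = Box(12, 0, 22, 12)
far = Box(60, 0, 70, 12)
print(same_line(q, u))    # True
print(same_line(q, far))  # False
```

Because both thresholds scale with the size of the characters, the same parameter values behave consistently for small and large fonts.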
|
||||
|
||||
Spaces need to be inserted between characters because the PDF format has no
|
||||
notion of the space character. A space is inserted if the characters are
|
||||
further apart than the `word_margin` (W in the figure). The `word_margin` is
|
||||
relative to the maximum width or height of the new character. Having a smaller
|
||||
`word_margin` creates smaller words and inserts spaces between characters
|
||||
more often. Note that the `word_margin` should be smaller than the
|
||||
`char_margin`; otherwise none of the characters are separated by a space.
|
||||
|
||||
The result of this stage is a list of lines. Each line consists of a list of
|
||||
characters. These characters are either original `LTChar` characters that
|
||||
originate from the PDF file, or inserted `LTAnno` characters that
|
||||
represent spaces between words or newlines at the end of each line.
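As a rough illustration of how spaces could be inserted, the sketch below walks over the characters of one line from left to right. It is a simplified stand-in, not the library's code: `join_with_spaces` and the tuple layout are hypothetical, and the default `word_margin` is an assumed value.

```python
def join_with_spaces(chars, word_margin=0.1):
    """Join the characters of one line, inserting spaces at large gaps.

    chars: list of (text, x0, x1, height) tuples sorted from left to right.
    A space is inserted when the gap to the previous character exceeds
    word_margin times the larger of the new character's width and height.
    """
    pieces = []
    previous_x1 = None
    for text, x0, x1, height in chars:
        margin = word_margin * max(x1 - x0, height)
        if previous_x1 is not None and x0 - previous_x1 > margin:
            pieces.append(" ")  # plays the role of an inserted LTAnno
        pieces.append(text)
        previous_x1 = x1
    return "".join(pieces)


line = [
    ("Q", 0, 10, 12),
    ("u", 10.5, 20, 12),
    ("i", 20.5, 25, 12),
    ("c", 40, 50, 12),
    ("k", 50.5, 60, 12),
]
print(join_with_spaces(line))  # Qui ck
```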
|
||||
|
||||
Grouping lines into boxes
|
||||
-------------------------
|
||||
|
||||
The second step is grouping lines in a meaningful way. Each line has a
|
||||
bounding box that is determined by the bounding boxes of the characters that
|
||||
it contains. Like grouping characters, pdfminer.six uses the bounding boxes
|
||||
to group the lines.
|
||||
|
||||
Lines that are both horizontally overlapping and vertically close are grouped.
|
||||
How vertically close the lines should be is determined by the `line_margin`.
|
||||
This margin is specified relative to the height of the bounding box. Lines
|
||||
are close if the gap between the tops (see L :sub:`1` in the figure) and bottoms
|
||||
(see L :sub:`2` in the figure) of the bounding boxes are closer together
|
||||
than the absolute line margin, i.e. the `line_margin` multiplied by the
|
||||
height of the bounding box.
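A simplified reading of this rule can be written down as follows. The sketch is an interpretation under stated assumptions, not the actual pdfminer.six code: `Box`, `same_box` and the default `line_margin` are hypothetical, and "vertically close" is reduced to the vertical gap between the two line boxes.

```python
from collections import namedtuple

# Hypothetical bounding box of a text line.
Box = namedtuple("Box", "x0 y0 x1 y1")


def horizontally_overlapping(a, b):
    """True if the x-ranges of the two boxes share at least one point."""
    return a.x0 <= b.x1 and b.x0 <= a.x1


def vertical_gap(a, b):
    """Vertical distance between two boxes (0.0 if they overlap)."""
    return max(0.0, max(a.y0, b.y0) - min(a.y1, b.y1))


def same_box(a, b, line_margin=0.5):
    """Decide whether two lines should end up in the same text box.

    The absolute margin is line_margin multiplied by the height of line a.
    """
    absolute_margin = line_margin * (a.y1 - a.y0)
    return horizontally_overlapping(a, b) and vertical_gap(a, b) < absolute_margin


first = Box(0, 20, 100, 32)   # e.g. "Quick brown"
second = Box(0, 5, 30, 17)    # e.g. "fox", directly below
print(same_box(first, second))  # True
```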
|
||||
|
||||
.. raw:: html
|
||||
:file: ../_static/layout_analysis_group_lines.html
|
||||
|
||||
The result of this stage is a list of text boxes. Each box consists of a list
|
||||
of lines.
|
||||
|
||||
Grouping textboxes hierarchically
|
||||
---------------------------------
|
||||
|
||||
The last step is to group the text boxes in a meaningful way. This step
|
||||
repeatedly merges the two text boxes that are closest to each other.
|
||||
|
||||
The closeness of bounding boxes is computed as the area that is between the
|
||||
two text boxes (the blue area in the figure). In other words, it is the area of
|
||||
the bounding box that surrounds both lines, minus the area of the bounding
|
||||
boxes of the individual lines.
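The distance measure and the repeated merging can be sketched as below. This is a naive illustration that assumes at least one input box; the names `Box`, `distance` and `group_hierarchically` are hypothetical, and the real implementation may be organised differently.

```python
from collections import namedtuple
from itertools import combinations

# Hypothetical bounding box of a text box.
Box = namedtuple("Box", "x0 y0 x1 y1")


def area(box):
    return (box.x1 - box.x0) * (box.y1 - box.y0)


def enclosing(a, b):
    """Smallest box that surrounds both a and b."""
    return Box(min(a.x0, b.x0), min(a.y0, b.y0), max(a.x1, b.x1), max(a.y1, b.y1))


def distance(a, b):
    """Area of the enclosing box that is covered by neither a nor b."""
    return area(enclosing(a, b)) - area(a) - area(b)


def group_hierarchically(boxes):
    """Repeatedly merge the two closest boxes and return the nested tree."""
    nodes = [(box, box) for box in boxes]  # (bounding box, subtree) pairs
    while len(nodes) > 1:
        left, right = min(
            combinations(nodes, 2),
            key=lambda pair: distance(pair[0][0], pair[1][0]),
        )
        nodes.remove(left)
        nodes.remove(right)
        nodes.append((enclosing(left[0], right[0]), (left[1], right[1])))
    return nodes[0][1]


a = Box(0, 20, 40, 30)
b = Box(0, 5, 40, 15)
c = Box(100, -50, 140, -40)
print(distance(a, b))  # 200
print(group_hierarchically([a, b, c]) == (c, (a, b)))  # True
```

The two nearby boxes are merged first, and the distant box only joins the hierarchy at the top level.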
|
||||
|
||||
.. raw:: html
|
||||
:file: ../_static/layout_analysis_group_boxes.html
|
||||
|
||||
|
||||
Working with rotated characters
|
||||
===============================
|
||||
|
||||
The algorithm described above assumes that all characters have the same
|
||||
orientation. However, any writing direction is possible in a PDF. To
|
||||
accommodate this, pdfminer.six can detect vertical writing with the
|
||||
`detect_vertical` parameter. This will apply all the grouping steps as if the
|
||||
PDF was rotated 90 (or 270) degrees.
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
.. [1] Adobe Systems Inc. (2007). *PDF reference: Adobe portable document
|
||||
format, version 1.7.*
|
|
@ -0,0 +1,7 @@
|
|||
Using pdfminer.six
|
||||
******************
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
converting_pdf_to_text
|
|
@ -0,0 +1,41 @@
|
|||
.. _tutorial_commandline:
|
||||
|
||||
Get started with command-line tools
|
||||
***********************************
|
||||
|
||||
pdfminer.six has several tools that can be used from the command line. The
|
||||
command-line tools are aimed at users that occasionally want to extract text
|
||||
from a pdf.
|
||||
|
||||
Take a look at the high-level or composable interface if you want to use
|
||||
pdfminer.six programmatically.
|
||||
|
||||
Examples
|
||||
========
|
||||
|
||||
pdf2txt.py
|
||||
----------
|
||||
|
||||
::
|
||||
|
||||
$ python tools/pdf2txt.py example.pdf
|
||||
all the text from the pdf appears on the command line
|
||||
|
||||
The :ref:`api_pdf2txt` tool extracts all the text from a PDF. It uses layout
|
||||
analysis with sensible defaults to order and group the text in a sensible way.
|
||||
|
||||
dumppdf.py
|
||||
----------
|
||||
|
||||
::
|
||||
|
||||
$ python tools/dumppdf.py -a example.pdf
|
||||
<pdf><object id="1">
|
||||
...
|
||||
</object>
|
||||
...
|
||||
</pdf>
|
||||
|
||||
The :ref:`api_dumppdf` tool can be used to extract the internal structure from a
|
||||
PDF. This tool is primarily for debugging purposes, but it can be useful to
|
||||
anybody working with PDFs.
|
|
@ -0,0 +1,33 @@
|
|||
.. _tutorial_composable:
|
||||
|
||||
Get started using the composable components API
|
||||
***********************************************
|
||||
|
||||
The command line tools and the high-level API are just shortcuts for often
|
||||
used combinations of pdfminer.six components. You can use these components to
|
||||
modify pdfminer.six to your own needs.
|
||||
|
||||
For example, to extract the text from a PDF file and save it in a python
|
||||
variable::
|
||||
|
||||
from io import StringIO
|
||||
|
||||
from pdfminer.converter import TextConverter
|
||||
from pdfminer.layout import LAParams
|
||||
from pdfminer.pdfdocument import PDFDocument
|
||||
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
|
||||
from pdfminer.pdfpage import PDFPage
|
||||
from pdfminer.pdfparser import PDFParser
|
||||
|
||||
output_string = StringIO()
|
||||
with open('samples/simple1.pdf', 'rb') as in_file:
|
||||
parser = PDFParser(in_file)
|
||||
doc = PDFDocument(parser)
|
||||
rsrcmgr = PDFResourceManager()
|
||||
device = TextConverter(rsrcmgr, output_string, laparams=LAParams())
|
||||
interpreter = PDFPageInterpreter(rsrcmgr, device)
|
||||
for page in PDFPage.create_pages(doc):
|
||||
interpreter.process_page(page)
|
||||
|
||||
print(output_string.getvalue())
|
||||
|
|
@ -0,0 +1,67 @@
|
|||
.. testsetup::
|
||||
|
||||
import sys
|
||||
from pdfminer.high_level import extract_text_to_fp, extract_text
|
||||
|
||||
.. _tutorial_highlevel:
|
||||
|
||||
Get started using the high-level functions
|
||||
******************************************
|
||||
|
||||
The high-level API can be used to do common tasks.
|
||||
|
||||
The simplest way to extract text from a PDF is to use
|
||||
:ref:`api_extract_text`:
|
||||
|
||||
.. doctest::
|
||||
|
||||
>>> text = extract_text('samples/simple1.pdf')
|
||||
>>> print(repr(text))
|
||||
'Hello \n\nWorld\n\nWorld\n\nHello \n\nH e l l o \n\nH e l l o \n\nW o r l d\n\nW o r l d\n\n\x0c'
|
||||
>>> print(text)
|
||||
... # doctest: +NORMALIZE_WHITESPACE
|
||||
Hello
|
||||
<BLANKLINE>
|
||||
World
|
||||
<BLANKLINE>
|
||||
World
|
||||
<BLANKLINE>
|
||||
Hello
|
||||
<BLANKLINE>
|
||||
H e l l o
|
||||
<BLANKLINE>
|
||||
H e l l o
|
||||
<BLANKLINE>
|
||||
W o r l d
|
||||
<BLANKLINE>
|
||||
W o r l d
|
||||
<BLANKLINE>
|
||||
|
||||
|
||||
To read text from a PDF and print it on the command line:
|
||||
|
||||
.. doctest::
|
||||
|
||||
>>> if sys.version_info > (3, 0):
|
||||
... from io import StringIO
|
||||
... else:
|
||||
... from io import BytesIO as StringIO
|
||||
>>> output_string = StringIO()
|
||||
>>> with open('samples/simple1.pdf', 'rb') as fin:
|
||||
... extract_text_to_fp(fin, output_string)
|
||||
>>> print(output_string.getvalue().strip())
|
||||
Hello WorldHello WorldHello WorldHello World
|
||||
|
||||
Or to convert it to html and use layout analysis:
|
||||
|
||||
.. doctest::
|
||||
|
||||
>>> if sys.version_info > (3, 0):
|
||||
... from io import StringIO
|
||||
... else:
|
||||
... from io import BytesIO as StringIO
|
||||
>>> from pdfminer.layout import LAParams
|
||||
>>> output_string = StringIO()
|
||||
>>> with open('samples/simple1.pdf', 'rb') as fin:
|
||||
... extract_text_to_fp(fin, output_string, laparams=LAParams(),
|
||||
... output_type='html', codec=None)
|
|
@ -0,0 +1,9 @@
|
|||
Getting started
|
||||
***************
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
commandline
|
||||
highlevel
|
||||
composable
|
|
@ -1,4 +0,0 @@
|
|||
blockquote { background: #eeeeee; }
|
||||
h1 { border-bottom: solid black 2px; }
|
||||
h2 { border-bottom: solid black 1px; }
|
||||
.comment { color: darkgreen; }
|
|
@ -13,7 +13,7 @@ other purposes instead of text analysis.
|
|||
import sys
|
||||
import warnings
|
||||
|
||||
__version__ = '20191020'
|
||||
__version__ = '20191107'
|
||||
|
||||
|
||||
if sys.version_info < (3, 0):
|
||||
|
|
|
@ -2,6 +2,7 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
import logging
|
||||
import re
|
||||
import sys
|
||||
from .pdfdevice import PDFTextDevice
|
||||
from .pdffont import PDFUnicodeNotDefined
|
||||
from .layout import LTContainer
|
||||
|
@ -271,6 +272,8 @@ class HTMLConverter(PDFConverter):
|
|||
def write(self, text):
|
||||
if self.codec:
|
||||
text = text.encode(self.codec)
|
||||
if sys.version_info < (3, 0):
|
||||
text = str(text)
|
||||
self.outfp.write(text)
|
||||
return
|
||||
|
||||
|
|
|
@ -1,49 +1,64 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
"""
|
||||
Functions that encapsulate "usual" use-cases for pdfminer, for use making
|
||||
bundled scripts and for using pdfminer as a module for routine tasks.
|
||||
"""
|
||||
"""Functions that can be used for the most common use-cases for pdfminer.six"""
|
||||
|
||||
import six
|
||||
import logging
|
||||
import sys
|
||||
|
||||
from .pdfdocument import PDFDocument
|
||||
from .pdfparser import PDFParser
|
||||
import six
|
||||
|
||||
# Conditional import because python 2 is stupid
|
||||
if sys.version_info > (3, 0):
|
||||
from io import StringIO
|
||||
else:
|
||||
from io import BytesIO as StringIO
|
||||
|
||||
from .pdfinterp import PDFResourceManager, PDFPageInterpreter
|
||||
from .pdfdevice import PDFDevice, TagExtractor
|
||||
from .pdfdevice import TagExtractor
|
||||
from .pdfpage import PDFPage
|
||||
from .converter import XMLConverter, HTMLConverter, TextConverter
|
||||
from .cmapdb import CMapDB
|
||||
from .image import ImageWriter
|
||||
from .layout import LAParams
|
||||
|
||||
|
||||
def extract_text_to_fp(inf, outfp,
|
||||
_py2_no_more_posargs=None, # Bloody Python2 needs a shim
|
||||
output_type='text', codec='utf-8', laparams = None,
|
||||
maxpages=0, page_numbers=None, password="", scale=1.0, rotation=0,
|
||||
layoutmode='normal', output_dir=None, strip_control=False,
|
||||
debug=False, disable_caching=False, **other):
|
||||
debug=False, disable_caching=False, **kwargs):
|
||||
"""
|
||||
Parses text from inf-file and writes to outfp file-like object.
|
||||
Takes loads of optional arguments but the defaults are somewhat sane.
|
||||
Beware laparams: Including an empty LAParams is not the same as passing None!
|
||||
Returns nothing, acting as it does on two streams. Use StringIO to get strings.
|
||||
|
||||
output_type: May be 'text', 'xml', 'html', 'tag'. Only 'text' works properly.
|
||||
codec: Text decoding codec
|
||||
laparams: An LAParams object from pdfminer.layout.
|
||||
Default is None but may not layout correctly.
|
||||
maxpages: How many pages to stop parsing after
|
||||
page_numbers: zero-indexed page numbers to operate on.
|
||||
password: For encrypted PDFs, the password to decrypt.
|
||||
scale: Scale factor
|
||||
rotation: Rotation factor
|
||||
layoutmode: Default is 'normal', see pdfminer.converter.HTMLConverter
|
||||
output_dir: If given, creates an ImageWriter for extracted images.
|
||||
strip_control: Does what it says on the tin
|
||||
debug: Output more logging data
|
||||
disable_caching: Does what it says on the tin
|
||||
:param inf: a file-like object to read PDF structure from, such as a
|
||||
file handler (using the builtin `open()` function) or a `BytesIO`.
|
||||
:param outfp: a file-like object to write the text to.
|
||||
:param output_type: May be 'text', 'xml', 'html', 'tag'. Only 'text' works properly.
|
||||
:param codec: Text decoding codec
|
||||
:param laparams: An LAParams object from pdfminer.layout. Default is None but may not layout correctly.
|
||||
:param maxpages: How many pages to stop parsing after
|
||||
:param page_numbers: zero-indexed page numbers to operate on.
|
||||
:param password: For encrypted PDFs, the password to decrypt.
|
||||
:param scale: Scale factor
|
||||
:param rotation: Rotation factor
|
||||
:param layoutmode: Default is 'normal', see pdfminer.converter.HTMLConverter
|
||||
:param output_dir: If given, creates an ImageWriter for extracted images.
|
||||
:param strip_control: Does what it says on the tin
|
||||
:param debug: Output more logging data
|
||||
:param disable_caching: Does what it says on the tin
|
||||
:param other:
|
||||
:return:
|
||||
"""
|
||||
if '_py2_no_more_posargs' in kwargs:
|
||||
raise DeprecationWarning(
|
||||
'The `_py2_no_more_posargs` will be removed in January 2020. At '
|
||||
'that moment pdfminer.six will stop supporting Python 2. Please '
|
||||
'upgrade to Python 3. For more information see '
|
||||
'https://github.com/pdfminer/pdfminer.six/issues/194')
|
||||
|
||||
if debug:
|
||||
logging.getLogger().setLevel(logging.DEBUG)
|
||||
|
||||
if six.PY2 and sys.stdin.encoding:
|
||||
password = password.decode(sys.stdin.encoding)
|
||||
|
||||
|
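The docstring above says the function returns nothing and suggests `StringIO` for getting a string back. A minimal sketch of that pattern follows. `write_text_to_fp` is a hypothetical stand-in that only mimics the "read one stream, write another, return None" shape; it does not parse PDF data and is not part of pdfminer.

```python
from io import BytesIO, StringIO


def write_text_to_fp(inf, outfp):
    """Hypothetical stand-in: read bytes from inf, write decoded text to outfp.

    Returns None, like the stream-to-stream function described above.
    """
    outfp.write(inf.read().decode("utf-8"))


def capture_text(data):
    """Collect what a stream-writing function emits and return it as a str."""
    with BytesIO(data) as inf, StringIO() as outfp:
        write_text_to_fp(inf, outfp)
        # getvalue() has to be called before the StringIO is closed.
        return outfp.getvalue()


captured = capture_text(b"hello from a stream")
```

The same wrapper shape should work for any function that writes to a file-like object, provided the output object accepts the type (text or bytes) that the function writes.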
@ -82,3 +97,41 @@ def extract_text_to_fp(inf, outfp,
|
|||
interpreter.process_page(page)
|
||||
|
||||
device.close()
|
||||
|
||||
|
||||
def extract_text(pdf_file, password='', page_numbers=None, maxpages=0,
|
||||
caching=True, codec='utf-8', laparams=None):
|
||||
"""
|
||||
Parses and returns the text contained in a PDF file.
|
||||
Takes loads of optional arguments but the defaults are somewhat sane.
|
||||
Returns a string containing all of the text extracted.
|
||||
|
||||
:param pdf_file: Path to the PDF file to be worked on
|
||||
:param password: For encrypted PDFs, the password to decrypt.
|
||||
:param page_numbers: List of zero-indexed page numbers to extract.
|
||||
:param maxpages: The maximum number of pages to parse
|
||||
:param caching: If resources should be cached
|
||||
:param codec: Text decoding codec
|
||||
:param laparams: LAParams object from pdfminer.layout.
|
||||
"""
|
||||
if laparams is None:
|
||||
laparams = LAParams()
|
||||
|
||||
with open(pdf_file, "rb") as fp, StringIO() as output_string:
|
||||
rsrcmgr = PDFResourceManager()
|
||||
device = TextConverter(rsrcmgr, output_string, codec=codec,
|
||||
laparams=laparams)
|
||||
interpreter = PDFPageInterpreter(rsrcmgr, device)
|
||||
|
||||
for page in PDFPage.get_pages(
|
||||
fp,
|
||||
page_numbers,
|
||||
maxpages=maxpages,
|
||||
password=password,
|
||||
caching=caching,
|
||||
check_extractable=True,
|
||||
):
|
||||
interpreter.process_page(page)
|
||||
|
||||
return output_string.getvalue()
|
||||
|
||||
|
|
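`extract_text` takes `laparams=None` and builds a default object inside the function body. The sketch below isolates that None-as-sentinel idiom. `LayoutParams` and `resolve_params` are hypothetical names used only for illustration; the single `line_overlap` attribute is an assumption made to keep the example short.

```python
class LayoutParams(object):
    """Hypothetical stand-in for a layout-parameter object."""

    def __init__(self, line_overlap=0.5):
        self.line_overlap = line_overlap


def resolve_params(params=None):
    """Return the caller's params, or a fresh default object for None."""
    if params is None:
        # Built per call, so no default instance is shared between calls.
        params = LayoutParams()
    return params


default_params = resolve_params()
custom_params = resolve_params(LayoutParams(line_overlap=0.3))
```

Using `None` as the default keeps the signature cheap to evaluate and avoids one mutable default object being reused across calls.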
|
@ -1,12 +1,14 @@
|
|||
|
||||
import struct
|
||||
import os
|
||||
import os.path
|
||||
import struct
|
||||
from io import BytesIO
|
||||
from .pdftypes import LITERALS_DCT_DECODE
|
||||
|
||||
from .jbig2 import JBIG2StreamReader, JBIG2StreamWriter
|
||||
from .pdfcolor import LITERAL_DEVICE_CMYK
|
||||
from .pdfcolor import LITERAL_DEVICE_GRAY
|
||||
from .pdfcolor import LITERAL_DEVICE_RGB
|
||||
from .pdfcolor import LITERAL_DEVICE_CMYK
|
||||
from .pdftypes import LITERALS_DCT_DECODE, LITERALS_JBIG2_DECODE
|
||||
|
||||
|
||||
def align32(x):
|
||||
|
@ -57,9 +59,11 @@ class BMPWriter(object):
|
|||
return
|
||||
|
||||
|
||||
## ImageWriter
|
||||
##
|
||||
class ImageWriter(object):
|
||||
"""Write image to a file
|
||||
|
||||
Supports various image types: JPEG, JBIG2 and bitmaps
|
||||
"""
|
||||
|
||||
def __init__(self, outdir):
|
||||
self.outdir = outdir
|
||||
|
@ -68,21 +72,15 @@ class ImageWriter(object):
|
|||
return
|
||||
|
||||
def export_image(self, image):
|
||||
stream = image.stream
|
||||
filters = stream.get_filters()
|
||||
(width, height) = image.srcsize
|
||||
if len(filters) == 1 and filters[0][0] in LITERALS_DCT_DECODE:
|
||||
ext = '.jpg'
|
||||
elif (image.bits == 1 or
|
||||
image.bits == 8 and (LITERAL_DEVICE_RGB in image.colorspace or LITERAL_DEVICE_GRAY in image.colorspace)):
|
||||
ext = '.%dx%d.bmp' % (width, height)
|
||||
else:
|
||||
ext = '.%d.%dx%d.img' % (image.bits, width, height)
|
||||
name = image.name+ext
|
||||
path = os.path.join(self.outdir, name)
|
||||
|
||||
is_jbig2 = self.is_jbig2_image(image)
|
||||
ext = self._get_image_extension(image, width, height, is_jbig2)
|
||||
name, path = self._create_unique_image_name(self.outdir, image.name, ext)
|
||||
|
||||
fp = open(path, 'wb')
|
||||
if ext == '.jpg':
|
||||
raw_data = stream.get_rawdata()
|
||||
raw_data = image.stream.get_rawdata()
|
||||
if LITERAL_DEVICE_CMYK in image.colorspace:
|
||||
from PIL import Image
|
||||
from PIL import ImageChops
|
||||
|
@ -93,9 +91,18 @@ class ImageWriter(object):
|
|||
i.save(fp, 'JPEG')
|
||||
else:
|
||||
fp.write(raw_data)
|
||||
elif is_jbig2:
|
||||
input_stream = BytesIO()
|
||||
input_stream.write(image.stream.get_data())
|
||||
input_stream.seek(0)
|
||||
reader = JBIG2StreamReader(input_stream)
|
||||
segments = reader.get_segments()
|
||||
|
||||
writer = JBIG2StreamWriter(fp)
|
||||
writer.write_file(segments)
|
||||
elif image.bits == 1:
|
||||
bmp = BMPWriter(fp, 1, width, height)
|
||||
data = stream.get_data()
|
||||
data = image.stream.get_data()
|
||||
i = 0
|
||||
width = (width+7)//8
|
||||
for y in range(height):
|
||||
|
@ -103,7 +110,7 @@ class ImageWriter(object):
|
|||
i += width
|
||||
elif image.bits == 8 and LITERAL_DEVICE_RGB in image.colorspace:
|
||||
bmp = BMPWriter(fp, 24, width, height)
|
||||
data = stream.get_data()
|
||||
data = image.stream.get_data()
|
||||
i = 0
|
||||
width = width*3
|
||||
for y in range(height):
|
||||
|
@ -111,12 +118,47 @@ class ImageWriter(object):
|
|||
i += width
|
||||
elif image.bits == 8 and LITERAL_DEVICE_GRAY in image.colorspace:
|
||||
bmp = BMPWriter(fp, 8, width, height)
|
||||
data = stream.get_data()
|
||||
data = image.stream.get_data()
|
||||
i = 0
|
||||
for y in range(height):
|
||||
bmp.write_line(y, data[i:i+width])
|
||||
i += width
|
||||
else:
|
||||
fp.write(stream.get_data())
|
||||
fp.write(image.stream.get_data())
|
||||
fp.close()
|
||||
return name
|
||||
|
||||
@staticmethod
|
||||
def is_jbig2_image(image):
|
||||
filters = image.stream.get_filters()
|
||||
is_jbig2 = False
|
||||
for filter_name, params in filters:
|
||||
if filter_name in LITERALS_JBIG2_DECODE:
|
||||
is_jbig2 = True
|
||||
break
|
||||
return is_jbig2
|
||||
|
||||
@staticmethod
|
||||
def _get_image_extension(image, width, height, is_jbig2):
|
||||
filters = image.stream.get_filters()
|
||||
if len(filters) == 1 and filters[0][0] in LITERALS_DCT_DECODE:
|
||||
ext = '.jpg'
|
||||
elif is_jbig2:
|
||||
ext = '.jb2'
|
||||
elif (image.bits == 1 or
|
||||
image.bits == 8 and (LITERAL_DEVICE_RGB in image.colorspace or LITERAL_DEVICE_GRAY in image.colorspace)):
|
||||
ext = '.%dx%d.bmp' % (width, height)
|
||||
else:
|
||||
ext = '.%d.%dx%d.img' % (image.bits, width, height)
|
||||
return ext
|
||||
|
||||
@staticmethod
|
||||
def _create_unique_image_name(dirname, image_name, ext):
|
||||
name = image_name + ext
|
||||
path = os.path.join(dirname, name)
|
||||
img_index = 0
|
||||
while os.path.exists(path):
|
||||
name = '%s.%d%s' % (image_name, img_index, ext)
|
||||
path = os.path.join(dirname, name)
|
||||
img_index += 1
|
||||
return name, path
|
||||
|
|
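The naming loop in `_create_unique_image_name` can be tried in isolation. The sketch below copies its logic into a free function (`create_unique_name` is a name chosen here, not one from the commit) and runs it against a temporary directory; the image name `Im0` is only an example value.

```python
import os
import tempfile


def create_unique_name(dirname, base_name, ext):
    """Return (name, path) for a file name not yet present in dirname."""
    name = base_name + ext
    path = os.path.join(dirname, name)
    index = 0
    # Append an increasing counter until the path is unused.
    while os.path.exists(path):
        name = '%s.%d%s' % (base_name, index, ext)
        path = os.path.join(dirname, name)
        index += 1
    return name, path


with tempfile.TemporaryDirectory() as outdir:
    names = []
    for _ in range(3):
        name, path = create_unique_name(outdir, 'Im0', '.jpg')
        with open(path, 'wb') as fp:
            fp.write(b'')  # create the file so the next call sees it
        names.append(name)

# names is now ['Im0.jpg', 'Im0.0.jpg', 'Im0.1.jpg']
```

Note that the check and the later `open()` are separate steps, so two processes writing to the same directory could still pick the same name.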
|
@ -0,0 +1,321 @@
|
|||
import math
|
||||
import os
|
||||
from struct import pack, unpack, calcsize
|
||||
|
||||
# segment structure base
|
||||
SEG_STRUCT = [
|
||||
(">L", "number"),
|
||||
(">B", "flags"),
|
||||
(">B", "retention_flags"),
|
||||
(">B", "page_assoc"),
|
||||
(">L", "data_length"),
|
||||
]
|
||||
|
||||
# segment header literals
|
||||
HEADER_FLAG_DEFERRED = 0b10000000
|
||||
HEADER_FLAG_PAGE_ASSOC_LONG = 0b01000000
|
||||
|
||||
SEG_TYPE_MASK = 0b00111111
|
||||
|
||||
REF_COUNT_SHORT_MASK = 0b11100000
|
||||
REF_COUNT_LONG_MASK = 0x1fffffff
|
||||
REF_COUNT_LONG = 7
|
||||
|
||||
DATA_LEN_UNKNOWN = 0xffffffff
|
||||
|
||||
# segment types
|
||||
SEG_TYPE_IMMEDIATE_GEN_REGION = 38
|
||||
SEG_TYPE_END_OF_PAGE = 49
|
||||
SEG_TYPE_END_OF_FILE = 51
|
||||
|
||||
# file literals
|
||||
FILE_HEADER_ID = b'\x97\x4A\x42\x32\x0D\x0A\x1A\x0A'
|
||||
FILE_HEAD_FLAG_SEQUENTIAL = 0b00000001
|
||||
FILE_HEAD_FLAG_PAGES_UNKNOWN = 0b00000010
|
||||
|
||||
|
||||
def bit_set(bit_pos, value):
|
||||
return bool((value >> bit_pos) & 1)
|
||||
|
||||
|
||||
def check_flag(flag, value):
|
||||
return bool(flag & value)
|
||||
|
||||
|
||||
def masked_value(mask, value):
|
||||
for bit_pos in range(0, 31):
|
||||
if bit_set(bit_pos, mask):
|
||||
return (value & mask) >> bit_pos
|
||||
|
||||
raise Exception("Invalid mask or value")
|
||||
|
||||
|
||||
def mask_value(mask, value):
|
||||
for bit_pos in range(0, 31):
|
||||
if bit_set(bit_pos, mask):
|
||||
return (value & (mask >> bit_pos)) << bit_pos
|
||||
|
||||
raise Exception("Invalid mask or value")
|
||||
|
||||
|
||||
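The two mask helpers above are inverses of each other for in-range values: one reads a bit field out of a byte, the other shifts a value into the field's position. The sketch below restates the idea with hypothetical names (`lowest_set_bit`, `extract_field`, `insert_field`) and applies it to two example bytes that were picked for illustration.

```python
SEG_TYPE_MASK = 0b00111111
REF_COUNT_SHORT_MASK = 0b11100000


def lowest_set_bit(mask):
    """Return the position of the least significant 1 bit in mask."""
    if mask == 0:
        raise ValueError("mask must have at least one bit set")
    pos = 0
    while not (mask >> pos) & 1:
        pos += 1
    return pos


def extract_field(mask, value):
    """Read the bits selected by mask, shifted down to start at bit 0."""
    return (value & mask) >> lowest_set_bit(mask)


def insert_field(mask, value):
    """Shift value up into the position selected by mask."""
    shift = lowest_set_bit(mask)
    return (value & (mask >> shift)) << shift


# Example flags byte: bit 6 set, low six bits hold the segment type.
seg_type = extract_field(SEG_TYPE_MASK, 0b01110001)
# Example retention byte: top three bits hold the short reference count.
ref_count = extract_field(REF_COUNT_SHORT_MASK, 0b01100011)
packed = insert_field(REF_COUNT_SHORT_MASK, ref_count)
```

Here `seg_type` is 49, `ref_count` is 3 and `packed` is `0b01100000`, so writing a value and reading it back returns the original as long as it fits in the field.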
class JBIG2StreamReader(object):
|
||||
"""Read segments from a JBIG2 byte stream"""
|
||||
|
||||
def __init__(self, stream):
|
||||
self.stream = stream
|
||||
|
||||
def get_segments(self):
|
||||
segments = []
|
||||
while not self.is_eof():
|
||||
segment = {}
|
||||
for field_format, name in SEG_STRUCT:
|
||||
field_len = calcsize(field_format)
|
||||
field = self.stream.read(field_len)
|
||||
if len(field) < field_len:
|
||||
segment["_error"] = True
|
||||
break
|
||||
value = unpack(field_format, field)
|
||||
if len(value) == 1:
|
||||
[value] = value
|
||||
parser = getattr(self, "parse_%s" % name, None)
|
||||
if callable(parser):
|
||||
value = parser(segment, value, field)
|
||||
segment[name] = value
|
||||
|
||||
if not segment.get("_error"):
|
||||
segments.append(segment)
|
||||
return segments
|
||||
|
||||
def is_eof(self):
|
||||
if self.stream.read(1) == b'':
|
||||
return True
|
||||
else:
|
||||
self.stream.seek(-1, os.SEEK_CUR)
|
||||
return False
|
||||
|
||||
def parse_flags(self, segment, flags, field):
|
||||
return {
|
||||
"deferred": check_flag(HEADER_FLAG_DEFERRED, flags),
|
||||
"page_assoc_long": check_flag(HEADER_FLAG_PAGE_ASSOC_LONG, flags),
|
||||
"type": masked_value(SEG_TYPE_MASK, flags)
|
||||
}
|
||||
|
||||
def parse_retention_flags(self, segment, flags, field):
|
||||
ref_count = masked_value(REF_COUNT_SHORT_MASK, flags)
|
||||
retain_segments = []
|
||||
ref_segments = []
|
||||
|
||||
if ref_count < REF_COUNT_LONG:
|
||||
for bit_pos in range(5):
|
||||
retain_segments.append(bit_set(bit_pos, flags))
|
||||
else:
|
||||
field += self.stream.read(3)
|
||||
[ref_count] = unpack(">L", field)
|
||||
ref_count = masked_value(REF_COUNT_LONG_MASK, ref_count)
|
||||
ret_bytes_count = int(math.ceil((ref_count + 1) / 8))
|
||||
for ret_byte_index in range(ret_bytes_count):
|
||||
[ret_byte] = unpack(">B", self.stream.read(1))
|
||||
for bit_pos in range(7):
|
||||
retain_segments.append(bit_set(bit_pos, ret_byte))
|
||||
|
||||
seg_num = segment["number"]
|
||||
if seg_num <= 256:
|
||||
ref_format = ">B"
|
||||
elif seg_num <= 65536:
|
||||
ref_format = ">I"
|
||||
else:
|
||||
ref_format = ">L"
|
||||
|
||||
ref_size = calcsize(ref_format)
|
||||
|
||||
for ref_index in range(ref_count):
|
||||
ref = self.stream.read(ref_size)
|
||||
[ref] = unpack(ref_format, ref)
|
||||
ref_segments.append(ref)
|
||||
|
||||
return {
|
||||
"ref_count": ref_count,
|
||||
"retain_segments": retain_segments,
|
||||
"ref_segments": ref_segments,
|
||||
}
|
||||
|
||||
def parse_page_assoc(self, segment, page, field):
|
||||
if segment["flags"]["page_assoc_long"]:
|
||||
field += self.stream.read(3)
|
||||
[page] = unpack(">L", field)
|
||||
return page
|
||||
|
||||
def parse_data_length(self, segment, length, field):
|
||||
if length:
|
||||
if (segment["flags"]["type"] == SEG_TYPE_IMMEDIATE_GEN_REGION) \
|
||||
and (length == DATA_LEN_UNKNOWN):
|
||||
|
||||
raise NotImplementedError(
|
||||
"Working with unknown segment length "
|
||||
"is not implemented yet"
|
||||
)
|
||||
else:
|
||||
segment["raw_data"] = self.stream.read(length)
|
||||
|
||||
return length
|
||||
|
||||
|
||||
class JBIG2StreamWriter(object):
|
||||
"""Write JBIG2 segments to a file in JBIG2 format"""
|
||||
|
||||
def __init__(self, stream):
|
||||
self.stream = stream
|
||||
|
||||
def write_segments(self, segments, fix_last_page=True):
|
||||
data_len = 0
|
||||
current_page = None
|
||||
seg_num = None
|
||||
|
||||
for segment in segments:
|
||||
data = self.encode_segment(segment)
|
||||
self.stream.write(data)
|
||||
data_len += len(data)
|
||||
|
||||
seg_num = segment["number"]
|
||||
|
||||
if fix_last_page:
|
||||
seg_page = segment.get("page_assoc")
|
||||
|
||||
if segment["flags"]["type"] == SEG_TYPE_END_OF_PAGE:
|
||||
current_page = None
|
||||
elif seg_page:
|
||||
current_page = seg_page
|
||||
|
||||
if fix_last_page and current_page and (seg_num is not None):
|
||||
segment = self.get_eop_segment(seg_num + 1, current_page)
|
||||
data = self.encode_segment(segment)
|
||||
self.stream.write(data)
|
||||
data_len += len(data)
|
||||
|
||||
return data_len
|
||||
|
||||
def write_file(self, segments, fix_last_page=True):
|
||||
header = FILE_HEADER_ID
|
||||
header_flags = FILE_HEAD_FLAG_SEQUENTIAL | FILE_HEAD_FLAG_PAGES_UNKNOWN
|
||||
header += pack(">B", header_flags)
|
||||
self.stream.write(header)
|
||||
data_len = len(header)
|
||||
|
||||
data_len += self.write_segments(segments, fix_last_page)
|
||||
|
||||
seg_num = 0
|
||||
for segment in segments:
|
||||
seg_num = segment["number"]
|
||||
|
||||
eof_segment = self.get_eof_segment(seg_num + 1)
|
||||
data = self.encode_segment(eof_segment)
|
||||
|
||||
self.stream.write(data)
|
||||
data_len += len(data)
|
||||
|
||||
return data_len
|
||||
|
||||
def encode_segment(self, segment):
|
||||
data = b''
|
||||
for field_format, name in SEG_STRUCT:
|
||||
value = segment.get(name)
|
||||
encoder = getattr(self, "encode_%s" % name, None)
|
||||
if callable(encoder):
|
||||
field = encoder(value, segment)
|
||||
else:
|
||||
field = pack(field_format, value)
|
||||
data += field
|
||||
return data
|
||||
|
||||
def encode_flags(self, value, segment):
|
||||
flags = 0
|
||||
if value.get("deferred"):
|
||||
flags |= HEADER_FLAG_DEFERRED
|
||||
|
||||
if "page_assoc_long" in value:
|
||||
flags |= HEADER_FLAG_PAGE_ASSOC_LONG \
|
||||
if value["page_assoc_long"] else flags
|
||||
else:
|
||||
flags |= HEADER_FLAG_PAGE_ASSOC_LONG \
|
||||
if segment.get("page_assoc", 0) > 255 else flags
|
||||
|
||||
flags |= mask_value(SEG_TYPE_MASK, value["type"])
|
||||
|
||||
return pack(">B", flags)
|
||||
|
||||
def encode_retention_flags(self, value, segment):
|
||||
flags = []
|
||||
flags_format = ">B"
|
||||
ref_count = value["ref_count"]
|
||||
retain_segments = value.get("retain_segments", [])
|
||||
|
||||
if ref_count <= 4:
|
||||
flags_byte = mask_value(REF_COUNT_SHORT_MASK, ref_count)
|
||||
for ref_index, ref_retain in enumerate(retain_segments):
|
||||
flags_byte |= (1 << ref_index) if ref_retain else 0
|
||||
flags.append(flags_byte)
|
||||
else:
|
||||
bytes_count = int(math.ceil((ref_count + 1) / 8))
|
||||
flags_format = ">L" + ("B" * bytes_count)
|
||||
flags_dword = mask_value(
|
||||
REF_COUNT_SHORT_MASK,
|
||||
REF_COUNT_LONG
|
||||
) << 24
|
||||
flags.append(flags_dword)
|
||||
|
||||
for byte_index in range(bytes_count):
|
||||
ret_byte = 0
|
||||
ret_part = retain_segments[byte_index * 8:byte_index * 8 + 8]
|
||||
for bit_pos, ret_seg in enumerate(ret_part):
|
||||
ret_byte |= 1 << bit_pos if ret_seg else ret_byte
|
||||
|
||||
flags.append(ret_byte)
|
||||
|
||||
ref_segments = value.get("ref_segments", [])
|
||||
|
||||
seg_num = segment["number"]
|
||||
if seg_num <= 256:
|
||||
ref_format = "B"
|
||||
elif seg_num <= 65536:
|
||||
ref_format = "I"
|
||||
else:
|
||||
ref_format = "L"
|
||||
|
||||
for ref in ref_segments:
|
||||
flags_format += ref_format
|
||||
flags.append(ref)
|
||||
|
||||
return pack(flags_format, *flags)
|
||||
|
||||
def encode_data_length(self, value, segment):
|
||||
data = pack(">L", value)
|
||||
data += segment["raw_data"]
|
||||
return data
|
||||
|
||||
def get_eop_segment(self, seg_number, page_number):
|
||||
return {
|
||||
'data_length': 0,
|
||||
'flags': {'deferred': False, 'type': SEG_TYPE_END_OF_PAGE},
|
||||
'number': seg_number,
|
||||
'page_assoc': page_number,
|
||||
'raw_data': b'',
|
||||
'retention_flags': {
|
||||
'ref_count': 0,
|
||||
'ref_segments': [],
|
||||
'retain_segments': []
|
||||
}
|
||||
}
|
||||
|
||||
def get_eof_segment(self, seg_number):
|
||||
return {
|
||||
'data_length': 0,
|
||||
'flags': {'deferred': False, 'type': SEG_TYPE_END_OF_FILE},
|
||||
'number': seg_number,
|
||||
'page_assoc': 0,
|
||||
'raw_data': b'',
|
||||
'retention_flags': {
|
||||
'ref_count': 0,
|
||||
'ref_segments': [],
|
||||
'retain_segments': []
|
||||
}
|
||||
}
|
|
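To make the byte layout concrete, here is a minimal sketch of what the writer emits in the simplest case: the nine-byte file header, followed by one segment that has no referred-to segments and a one-byte page association. `build_file_header` and `pack_simple_segment` are hypothetical helpers, and collapsing the five `SEG_STRUCT` fields into one format string only holds under those assumptions.

```python
from struct import calcsize, pack, unpack

FILE_HEADER_ID = b'\x97\x4A\x42\x32\x0D\x0A\x1A\x0A'
FILE_HEAD_FLAG_SEQUENTIAL = 0b00000001
FILE_HEAD_FLAG_PAGES_UNKNOWN = 0b00000010
SEG_TYPE_MASK = 0b00111111
SEG_TYPE_END_OF_PAGE = 49

# number, flags, retention flags, page association, data length
SIMPLE_SEG_FORMAT = ">LBBBL"


def build_file_header():
    """File ID followed by one flags byte (sequential, page count unknown)."""
    flags = FILE_HEAD_FLAG_SEQUENTIAL | FILE_HEAD_FLAG_PAGES_UNKNOWN
    return FILE_HEADER_ID + pack(">B", flags)


def pack_simple_segment(number, seg_type, page, raw_data=b''):
    """Pack a segment with zero references and a page number below 256."""
    flags = seg_type & SEG_TYPE_MASK  # deferred and long-page bits left clear
    retention = 0                     # reference count 0, nothing retained
    header = pack(SIMPLE_SEG_FORMAT, number, flags, retention, page,
                  len(raw_data))
    return header + raw_data


file_header = build_file_header()
eop_segment = pack_simple_segment(1, SEG_TYPE_END_OF_PAGE, 1)
fields = unpack(SIMPLE_SEG_FORMAT, eop_segment)
```

With the `>` prefix `struct` adds no padding, so the fixed part of such a segment is `calcsize(SIMPLE_SEG_FORMAT)`, that is 11 bytes.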
@ -1,18 +1,15 @@
|
|||
from sortedcontainers import SortedListWithKey
|
||||
import heapq
|
||||
|
||||
from .utils import INF
|
||||
from .utils import Plane
|
||||
from .utils import get_bound
|
||||
from .utils import uniq
|
||||
from .utils import fsplit
|
||||
from .utils import bbox2str
|
||||
from .utils import matrix2str
|
||||
from .utils import apply_matrix_pt
|
||||
from .utils import bbox2str
|
||||
from .utils import fsplit
|
||||
from .utils import get_bound
|
||||
from .utils import matrix2str
|
||||
from .utils import uniq
|
||||
|
||||
import six # Python 2+3 compatibility
|
||||
|
||||
## IndexAssigner
|
||||
##
|
||||
class IndexAssigner(object):
|
||||
|
||||
def __init__(self, index=0):
|
||||
|
@ -29,9 +26,33 @@ class IndexAssigner(object):
|
|||
return
|
||||
|
||||
|
||||
## LAParams
|
||||
##
|
||||
class LAParams(object):
|
||||
"""Parameters for layout analysis
|
||||
|
||||
:param line_overlap: If two characters have more overlap than this they
|
||||
are considered to be on the same line. The overlap is specified
|
||||
relative to the minimum height of both characters.
|
||||
:param char_margin: If two characters are closer together than this
|
||||
margin they are considered to be part of the same word. If
|
||||
characters are on the same line but not part of the same word, an
|
||||
intermediate space is inserted. The margin is specified relative to
|
||||
the width of the character.
|
||||
:param word_margin: If two words are closer together than this
|
||||
margin they are considered to be part of the same line. A space is
|
||||
added in between for readability. The margin is specified relative
|
||||
to the width of the word.
|
||||
:param line_margin: If two lines are close together they are
|
||||
considered to be part of the same paragraph. The margin is
|
||||
specified relative to the height of a line.
|
||||
:param boxes_flow: Specifies how much a horizontal and vertical position
|
||||
of a text matters when determining the order of lines. The value
|
||||
should be within the range of -1.0 (only horizontal position
|
||||
matters) to +1.0 (only vertical position matters).
|
||||
:param detect_vertical: If vertical text should be considered during
|
||||
layout analysis
|
||||
:param all_texts: If layout analysis should be performed on text in
|
||||
figures.
|
||||
"""
|
||||
|
||||
def __init__(self,
|
||||
line_overlap=0.5,
|
||||
|
@ -55,30 +76,28 @@ class LAParams(object):
|
|||
(self.char_margin, self.line_margin, self.word_margin, self.all_texts))
|
||||
|
||||
|
||||
## LTItem
|
||||
##
|
||||
class LTItem(object):
|
||||
"""Interface for things that can be analyzed"""
|
||||
|
||||
def analyze(self, laparams):
|
||||
"""Perform the layout analysis."""
|
||||
return
|
||||
|
||||
|
||||
## LTText
|
||||
##
|
||||
class LTText(object):
|
||||
"""Interface for things that have text"""
|
||||
|
||||
def __repr__(self):
|
||||
return ('<%s %r>' %
|
||||
(self.__class__.__name__, self.get_text()))
|
||||
|
||||
def get_text(self):
|
||||
"""Text contained in this object"""
|
||||
raise NotImplementedError
|
||||
|
||||
|
||||
## LTComponent
|
||||
##
|
||||
class LTComponent(LTItem):
|
||||
"""Object with a bounding box"""
|
||||
|
||||
def __init__(self, bbox):
|
||||
LTItem.__init__(self)
|
||||
|
@ -92,10 +111,13 @@ class LTComponent(LTItem):
|
|||
# Disable comparison.
|
||||
def __lt__(self, _):
|
||||
raise ValueError
|
||||
|
||||
def __le__(self, _):
|
||||
raise ValueError
|
||||
|
||||
def __gt__(self, _):
|
||||
raise ValueError
|
||||
|
||||
def __ge__(self, _):
|
||||
raise ValueError
|
||||
|
||||
|
@ -150,9 +172,8 @@ class LTComponent(LTItem):
|
|||
return 0
|
||||
|
||||
|
||||
## LTCurve
|
||||
##
|
||||
class LTCurve(LTComponent):
|
||||
"""A generic Bezier curve"""
|
||||
|
||||
def __init__(self, linewidth, pts, stroke = False, fill = False, evenodd = False, stroking_color = None, non_stroking_color = None):
|
||||
LTComponent.__init__(self, get_bound(pts))
|
||||
|
@ -169,18 +190,22 @@ class LTCurve(LTComponent):
|
|||
return ','.join('%.3f,%.3f' % p for p in self.pts)
|
||||
|
||||
|
||||
## LTLine
|
||||
##
|
||||
class LTLine(LTCurve):
|
||||
"""A single straight line.
|
||||
|
||||
Could be used for separating text or figures.
|
||||
"""
|
||||
|
||||
def __init__(self, linewidth, p0, p1, stroke = False, fill = False, evenodd = False, stroking_color = None, non_stroking_color = None):
|
||||
LTCurve.__init__(self, linewidth, [p0, p1], stroke, fill, evenodd, stroking_color, non_stroking_color)
|
||||
return
|
||||
|
||||
|
||||
## LTRect
|
||||
##
|
||||
class LTRect(LTCurve):
|
||||
"""A rectangle.
|
||||
|
||||
Could be used for framing other pictures or figures.
|
||||
"""
|
||||
|
||||
def __init__(self, linewidth, bbox, stroke = False, fill = False, evenodd = False, stroking_color = None, non_stroking_color = None):
|
||||
(x0, y0, x1, y1) = bbox
|
||||
|
@ -188,9 +213,11 @@ class LTRect(LTCurve):
|
|||
return
|
||||
|
||||
|
||||
## LTImage
|
||||
##
|
||||
class LTImage(LTComponent):
|
||||
"""An image object.
|
||||
|
||||
Embedded images can be in JPEG, Bitmap or JBIG2.
|
||||
"""
|
||||
|
||||
def __init__(self, name, stream, bbox):
|
||||
LTComponent.__init__(self, bbox)
|
||||
|
@ -211,9 +238,13 @@ class LTImage(LTComponent):
|
|||
bbox2str(self.bbox), self.srcsize))
|
||||
|
||||
|
||||
## LTAnno
|
||||
##
|
||||
class LTAnno(LTItem, LTText):
|
||||
"""Actual letter in the text as a Unicode string.
|
||||
|
||||
Note that, while an LTChar object has actual boundaries, LTAnno objects do
|
||||
not, as these are "virtual" characters, inserted by a layout analyzer
|
||||
according to the relationship between two characters (e.g. a space).
|
||||
"""
|
||||
|
||||
def __init__(self, text):
|
||||
self._text = text
|
||||
|
@ -223,9 +254,8 @@ class LTAnno(LTItem, LTText):
|
|||
return self._text
|
||||
|
||||
|
||||
## LTChar
|
||||
##
|
||||
class LTChar(LTComponent, LTText):
|
||||
"""Actual letter in the text as a Unicode string."""
|
||||
|
||||
def __init__(self, matrix, font, fontsize, scaling, rise,
|
||||
text, textwidth, textdisp, ncs, graphicstate):
|
||||
|
@ -286,9 +316,8 @@ class LTChar(LTComponent, LTText):
|
|||
return True
|
||||
|
||||
|
||||
## LTContainer
|
||||
##
|
||||
class LTContainer(LTComponent):
|
||||
"""Object that can be extended and analyzed"""
|
||||
|
||||
def __init__(self, bbox):
|
||||
LTComponent.__init__(self, bbox)
|
||||
|
@ -316,10 +345,7 @@ class LTContainer(LTComponent):
|
|||
return
|
||||
|
||||
|
||||
## LTExpandableContainer
|
||||
##
|
||||
class LTExpandableContainer(LTContainer):
|
||||
|
||||
def __init__(self):
|
||||
LTContainer.__init__(self, (+INF, +INF, -INF, -INF))
|
||||
return
|
||||
|
@ -331,10 +357,7 @@ class LTExpandableContainer(LTContainer):
|
|||
return
|
||||
|
||||
|
||||
## LTTextContainer
|
||||
##
|
||||
class LTTextContainer(LTExpandableContainer, LTText):
|
||||
|
||||
def __init__(self):
|
||||
LTText.__init__(self)
|
||||
LTExpandableContainer.__init__(self)
|
||||
|
@ -344,9 +367,12 @@ class LTTextContainer(LTExpandableContainer, LTText):
|
|||
return ''.join(obj.get_text() for obj in self if isinstance(obj, LTText))
|
||||
|
||||
|
||||
## LTTextLine
|
||||
##
|
||||
class LTTextLine(LTTextContainer):
|
||||
"""Contains a list of LTChar objects that represent a single text line.
|
||||
|
||||
The characters are aligned either horizontally or vertically, depending on
|
||||
the text's writing mode.
|
||||
"""
|
||||
|
||||
def __init__(self, word_margin):
|
||||
LTTextContainer.__init__(self)
|
||||
|
@ -368,7 +394,6 @@ class LTTextLine(LTTextContainer):
|
|||
|
||||
|
||||
class LTTextLineHorizontal(LTTextLine):
|
||||
|
||||
def __init__(self, word_margin):
|
||||
LTTextLine.__init__(self, word_margin)
|
||||
self._x1 = +INF
|
||||
|
@ -394,7 +419,6 @@ class LTTextLineHorizontal(LTTextLine):
|
|||
|
||||
|
||||
class LTTextLineVertical(LTTextLine):
|
||||
|
||||
def __init__(self, word_margin):
|
||||
LTTextLine.__init__(self, word_margin)
|
||||
self._y0 = -INF
|
||||
|
@ -419,12 +443,13 @@ class LTTextLineVertical(LTTextLine):
|
|||
abs(obj.y1-self.y1) < d))]
|
||||
|
||||
|
||||
## LTTextBox
|
||||
##
|
||||
## A set of text objects that are grouped within
|
||||
## a certain rectangular area.
|
||||
##
|
||||
class LTTextBox(LTTextContainer):
|
||||
"""Represents a group of text chunks in a rectangular area.
|
||||
|
||||
Note that this box is created by geometric analysis and does not necessarily
|
||||
represent a logical boundary of the text. It contains a list of
|
||||
LTTextLine objects.
|
||||
"""
|
||||
|
||||
def __init__(self):
|
||||
LTTextContainer.__init__(self)
|
||||
|
@ -438,7 +463,6 @@ class LTTextBox(LTTextContainer):
|
|||
|
||||
|
||||
class LTTextBoxHorizontal(LTTextBox):
|
||||
|
||||
def analyze(self, laparams):
|
||||
LTTextBox.analyze(self, laparams)
|
||||
self._objs.sort(key=lambda obj: -obj.y1)
|
||||
|
@ -449,7 +473,6 @@ class LTTextBoxHorizontal(LTTextBox):
|
|||
|
||||
|
||||
class LTTextBoxVertical(LTTextBox):
|
||||
|
||||
def analyze(self, laparams):
|
||||
LTTextBox.analyze(self, laparams)
|
||||
self._objs.sort(key=lambda obj: -obj.x1)
|
||||
|
@ -459,10 +482,7 @@ class LTTextBoxVertical(LTTextBox):
|
|||
return 'tb-rl'
|
||||
|
||||
|
||||
## LTTextGroup
|
||||
##
|
||||
class LTTextGroup(LTTextContainer):
|
||||
|
||||
def __init__(self, objs):
|
||||
LTTextContainer.__init__(self)
|
||||
self.extend(objs)
|
||||
|
@ -470,7 +490,6 @@ class LTTextGroup(LTTextContainer):
|
|||
|
||||
|
||||
class LTTextGroupLRTB(LTTextGroup):
|
||||
|
||||
def analyze(self, laparams):
|
||||
LTTextGroup.analyze(self, laparams)
|
||||
# reorder the objects from top-left to bottom-right.
|
||||
|
@ -481,7 +500,6 @@ class LTTextGroupLRTB(LTTextGroup):
|
|||
|
||||
|
||||
class LTTextGroupTBRL(LTTextGroup):
|
||||
|
||||
def analyze(self, laparams):
|
||||
LTTextGroup.analyze(self, laparams)
|
||||
# reorder the objects from top-right to bottom-left.
|
||||
|
@ -491,10 +509,7 @@ class LTTextGroupTBRL(LTTextGroup):
|
|||
return
|
||||
|
||||
|
||||
## LTLayoutContainer
|
||||
##
|
||||
class LTLayoutContainer(LTContainer):
|
||||
|
||||
def __init__(self, bbox):
|
||||
LTContainer.__init__(self, bbox)
|
||||
self.groups = None
|
||||
|
@ -603,9 +618,22 @@ class LTLayoutContainer(LTContainer):
|
|||
yield box
|
||||
return
|
||||
|
||||
# group_textboxes: group textboxes hierarchically.
|
||||
def group_textboxes(self, laparams, boxes):
|
||||
assert boxes, str((laparams, boxes))
|
||||
"""Group textboxes hierarchically.
|
||||
|
||||
Get pair-wise distances, via dist func defined below, and then merge from the closest textbox pair. Once
|
||||
obj1 and obj2 are merged / grouped, the resulting group is considered as a new object, and its distances to
|
||||
other objects & groups are added to the process queue.
|
||||
|
||||
For performance reasons, pair-wise distances and object pair info are maintained in a heap of
|
||||
(idx, dist, id(obj1), id(obj2), obj1, obj2) tuples. It ensures quick access to the smallest element. Note that
|
||||
since comparison operators, e.g., __lt__, are disabled for LTComponent, id(obj) has to appear before obj in
|
||||
element tuples.
|
||||
|
||||
:param laparams: LAParams object.
|
||||
:param boxes: All textbox objects to be grouped.
|
||||
:return: a list that has only one element, the final top level textbox.
|
||||
"""
|
||||
|
||||
def dist(obj1, obj2):
|
||||
"""A distance function between two TextBoxes.
|
||||
|
@ -626,8 +654,7 @@ class LTLayoutContainer(LTContainer):
|
|||
return ((x1-x0)*(y1-y0) - obj1.width*obj1.height - obj2.width*obj2.height)
|
||||
|
||||
def isany(obj1, obj2):
|
||||
"""Check if there's any other object between obj1 and obj2.
|
||||
"""
|
||||
"""Check if there's any other object between obj1 and obj2."""
|
||||
x0 = min(obj1.x0, obj2.x0)
|
||||
y0 = min(obj1.y0, obj2.y0)
|
||||
x1 = max(obj1.x1, obj2.x1)
|
||||
|
@ -635,39 +662,36 @@ class LTLayoutContainer(LTContainer):
|
|||
objs = set(plane.find((x0, y0, x1, y1)))
|
||||
return objs.difference((obj1, obj2))
|
||||
|
||||
def key_obj(t):
|
||||
(c,d,_,_) = t
|
||||
return (c,d)
|
||||
|
||||
dists = SortedListWithKey(key=key_obj)
|
||||
dists = []
|
||||
for i in range(len(boxes)):
|
||||
obj1 = boxes[i]
|
||||
for j in range(i+1, len(boxes)):
|
||||
obj2 = boxes[j]
|
||||
dists.add((0, dist(obj1, obj2), obj1, obj2))
|
||||
dists.append((True, dist(obj1, obj2), id(obj1), id(obj2), obj1, obj2))
|
||||
heapq.heapify(dists)
|
||||
|
||||
plane = Plane(self.bbox)
|
||||
plane.extend(boxes)
|
||||
while dists:
|
||||
(c, d, obj1, obj2) = dists.pop(0)
|
||||
if c == 0 and isany(obj1, obj2):
|
||||
dists.add((1, d, obj1, obj2))
|
||||
done = set()
|
||||
while len(dists) > 0:
|
||||
(is_first, d, id1, id2, obj1, obj2) = heapq.heappop(dists)
|
||||
# Skip objects that are already merged
|
||||
if (id1 not in done) and (id2 not in done):
|
||||
if is_first and isany(obj1, obj2):
|
||||
heapq.heappush(dists, (False, d, id1, id2, obj1, obj2))
|
||||
continue
|
||||
if (isinstance(obj1, (LTTextBoxVertical, LTTextGroupTBRL)) or
|
||||
isinstance(obj2, (LTTextBoxVertical, LTTextGroupTBRL))):
|
||||
if isinstance(obj1, (LTTextBoxVertical, LTTextGroupTBRL)) or \
|
||||
isinstance(obj2, (LTTextBoxVertical, LTTextGroupTBRL)):
|
||||
group = LTTextGroupTBRL([obj1, obj2])
|
||||
else:
|
||||
group = LTTextGroupLRTB([obj1, obj2])
|
||||
plane.remove(obj1)
|
||||
plane.remove(obj2)
|
||||
removed = [obj1, obj2]
|
||||
to_remove = [ (c,d,obj1,obj2) for (c,d,obj1,obj2) in dists
|
||||
if (obj1 in removed or obj2 in removed) ]
|
||||
for r in to_remove:
|
||||
dists.remove(r)
|
||||
done.update([id1, id2])
|
||||
|
||||
for other in plane:
|
||||
dists.add((0, dist(group, other), group, other))
|
||||
heapq.heappush(dists, (False, dist(group, other), id(group), id(other), group, other))
|
||||
plane.add(group)
|
||||
assert len(plane) == 1, str(len(plane))
|
||||
return list(plane)
|
||||
|
||||
def analyze(self, laparams):
|
||||
|
@ -701,9 +725,13 @@ class LTLayoutContainer(LTContainer):
|
|||
return
|
||||
|
||||
|
||||
## LTFigure
|
||||
##
|
||||
class LTFigure(LTLayoutContainer):
|
||||
"""Represents an area used by PDF Form objects.
|
||||
|
||||
PDF Forms can be used to present figures or pictures by embedding yet
|
||||
another PDF document within a page. Note that LTFigure objects can appear
|
||||
recursively.
|
||||
"""
|
||||
|
||||
def __init__(self, name, bbox, matrix):
|
||||
self.name = name
|
||||
|
@ -726,9 +754,12 @@ class LTFigure(LTLayoutContainer):
|
|||
return
|
||||
|
||||
|
||||
## LTPage
|
||||
##
|
||||
class LTPage(LTLayoutContainer):
|
||||
"""Represents an entire page.
|
||||
|
||||
May contain child objects like LTTextBox, LTFigure, LTImage, LTRect,
|
||||
LTCurve and LTLine.
|
||||
"""
|
||||
|
||||
def __init__(self, pageid, bbox, rotate=0):
|
||||
LTLayoutContainer.__init__(self, bbox)
|
||||
|
|
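The `group_textboxes` docstring explains that `id(obj)` has to precede the objects in each heap tuple because ordering is disabled on layout components. The sketch below reproduces that situation with a hypothetical `Box` class whose `__lt__` raises, and a hypothetical `closest_pair` helper; neither name comes from the commit.

```python
import heapq


class Box(object):
    """Hypothetical stand-in for an object whose ordering is disabled."""

    def __init__(self, label):
        self.label = label

    def __lt__(self, _):
        raise ValueError


def closest_pair(pairs):
    """Return (distance, obj1, obj2) with the smallest distance.

    pairs is an iterable of (distance, obj1, obj2). The ids are placed
    before the objects so that ties on distance are settled by comparing
    integers and the objects themselves are never ordered.
    """
    heap = [(d, id(a), id(b), a, b) for d, a, b in pairs]
    heapq.heapify(heap)
    d, _, _, a, b = heapq.heappop(heap)
    return d, a, b


a, b, c = Box('a'), Box('b'), Box('c')
distance, first, second = closest_pair(
    [(2.0, a, b), (1.0, a, c), (1.0, b, c)])
```

Without the ids, a comparison such as `(1.0, a, b) < (1.0, a, c)` falls through to `b < c` and raises `ValueError`. Which of the two tied pairs is returned depends on the ids, so it should not be relied on.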
|
@ -5,10 +5,13 @@ import six #Python 2+3 compatibility
|
|||
|
||||
import logging
|
||||
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class CorruptDataError(Exception):
|
||||
pass
|
||||
|
||||
|
||||
## LZWDecoder
|
||||
##
|
||||
class LZWDecoder(object):
|
||||
|
@ -90,7 +93,7 @@ class LZWDecoder(object):
|
|||
# just ignore corrupt data and stop yielding there
|
||||
break
|
||||
yield x
|
||||
logging.debug('nbits=%d, code=%d, output=%r, table=%r' %
|
||||
logger.debug('nbits=%d, code=%d, output=%r, table=%r' %
|
||||
(self.nbits, code, x, self.table[258:]))
|
||||
return
|
||||
|
||||
|
|
|
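Review note on the `logging.debug` to `logger.debug` change above: routing messages through a module-level logger means each record carries the name of the module that emitted it, so applications can filter or silence the library's output per module. The following is a minimal sketch of that pattern, not the project's code. The function `report` is a hypothetical name, and passing the values as separate arguments (rather than pre-formatting with `%` as the hunk does) is a variation chosen here so that formatting is deferred.

```python
import logging

# Module-level logger, named after the module that defines it.
logger = logging.getLogger(__name__)


def report(nbits, code, output):
    """Log decoder state through the module logger, not the root logger."""
    # The values are passed as arguments, so the message is only formatted
    # if a handler actually emits the record.
    logger.debug('nbits=%d, code=%d, output=%r', nbits, code, output)
```

A caller can then enable these messages with `logging.getLogger('<module name>').setLevel(logging.DEBUG)` without touching the root logger.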
@ -2,13 +2,13 @@
|
|||
|
||||
import six
|
||||
|
||||
from . import utils
|
||||
from .pdffont import PDFUnicodeNotDefined
|
||||
|
||||
from . import utils
|
||||
|
||||
## PDFDevice
|
||||
##
|
||||
class PDFDevice(object):
|
||||
"""Translate the output of PDFPageInterpreter to the output that is needed
|
||||
"""
|
||||
|
||||
def __init__(self, rsrcmgr):
|
||||
self.rsrcmgr = rsrcmgr
|
||||
|
|
|
@ -671,7 +671,11 @@ class PDFDocument(object):
|
|||
|
||||
# can raise PDFObjectNotFound
|
||||
def getobj(self, objid):
|
||||
assert objid != 0
|
||||
"""Get object from PDF
|
||||
|
||||
:raises PDFException if PDFDocument is not initialized
|
||||
:raises PDFObjectNotFound if objid does not exist in PDF
|
||||
"""
|
||||
if not self.xrefs:
|
||||
raise PDFException('PDFDocument is not initialized')
|
||||
log.debug('getobj: objid=%r', objid)
|
||||
|
|
|
@ -318,9 +318,8 @@ class PDFContentParser(PSStackParser):
|
|||
return
|
||||
|
||||
|
||||
## Interpreter
|
||||
##
|
||||
class PDFPageInterpreter(object):
|
||||
"""Processor for the content of a PDF page"""
|
||||
|
||||
def __init__(self, rsrcmgr, device):
|
||||
self.rsrcmgr = rsrcmgr
|
||||
|
|
|
@ -27,7 +27,7 @@ LITERALS_ASCIIHEX_DECODE = (LIT('ASCIIHexDecode'), LIT('AHx'))
|
|||
LITERALS_RUNLENGTH_DECODE = (LIT('RunLengthDecode'), LIT('RL'))
|
||||
LITERALS_CCITTFAX_DECODE = (LIT('CCITTFaxDecode'), LIT('CCF'))
|
||||
LITERALS_DCT_DECODE = (LIT('DCTDecode'), LIT('DCT'))
|
||||
|
||||
LITERALS_JBIG2_DECODE = (LIT('JBIG2Decode'),)
|
||||
|
||||
## PDF Objects
|
||||
##
|
||||
|
@ -275,6 +275,8 @@ class PDFStream(PDFObject):
|
|||
# This is probably a JPG stream - it does not need to be decoded twice.
|
||||
# Just return the stream to the user.
|
||||
pass
|
||||
elif f in LITERALS_JBIG2_DECODE:
|
||||
pass
|
||||
elif f == LITERAL_CRYPT:
|
||||
# not yet..
|
||||
raise PDFNotImplementedError('/Crypt filter is unsupported')
|
||||
|
|
|
@ -1,8 +1 @@
|
|||
STRICT = False
|
||||
|
||||
try:
|
||||
from django.conf import settings
|
||||
STRICT = getattr(settings, 'PDF_MINER_IS_STRICT', STRICT)
|
||||
except Exception:
|
||||
# in case it's not a django project
|
||||
pass
|
||||
|
|
|
@ -20,6 +20,17 @@ jo.pdf:
|
|||
(File generated from jo.tex by LaTeX and dvi2pdfm)
|
||||
|
||||
--
|
||||
contrib/matplotlib.pdf
|
||||
Copyright 2018, James R Barlow
|
||||
Example file created in matplotlib to add a Type3 font to the samples
|
||||
Released under the terms of the "LICENSE" file
|
||||
|
||||
--
|
||||
nonfree/cmp_itext_logo.pdf
|
||||
Bruno Lowagie
|
||||
"iText Logo - Type 3 font"
|
||||
http://gitlab.itextsupport.com/itext/sandbox/raw/master/cmpfiles/fonts/cmp_itext_logo.pdf
|
||||
|
||||
nonfree/dmca.pdf:
|
||||
U.S. Copyright Office
|
||||
The Digital Millennium Copyright Act
|
||||
|
|
Binary file not shown.
Binary file not shown.
File diff suppressed because it is too large
|
@ -1,23 +0,0 @@
|
|||
<?xml version="1.0" encoding="utf-8" ?>
|
||||
<pages>
|
||||
<page id="1" bbox="0.000,0.000,595.000,842.000" rotate="0">
|
||||
<textbox id="0" bbox="56.800,771.508,90.688,787.264">
|
||||
<textline bbox="56.800,771.508,90.688,787.264">
|
||||
<text font="BAAAAA+TimesNewRomanPSMT" bbox="56.800,771.508,63.472,787.264" size="15.756">S</text>
|
||||
<text font="BAAAAA+TimesNewRomanPSMT" bbox="63.484,771.508,68.800,787.264" size="15.756">e</text>
|
||||
<text font="BAAAAA+TimesNewRomanPSMT" bbox="68.788,771.508,74.104,787.264" size="15.756">c</text>
|
||||
<text font="BAAAAA+TimesNewRomanPSMT" bbox="74.092,771.508,78.088,787.264" size="15.756">r</text>
|
||||
<text font="BAAAAA+TimesNewRomanPSMT" bbox="78.088,771.508,83.404,787.264" size="15.756">e</text>
|
||||
<text font="BAAAAA+TimesNewRomanPSMT" bbox="83.392,771.508,86.716,787.264" size="15.756">t</text>
|
||||
<text font="BAAAAA+TimesNewRomanPSMT" bbox="86.692,771.508,90.688,787.264" size="15.756">!</text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
</textbox>
|
||||
<figure name="Tr4" bbox="-9.000,420.000,595.000,840.100">
|
||||
</figure>
|
||||
<layout>
|
||||
<textbox id="0" bbox="56.800,771.508,90.688,787.264" />
|
||||
</layout>
|
||||
</page>
|
||||
</pages>
|
|
@ -1,72 +0,0 @@
|
|||
<html><head>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
||||
</head><body>
|
||||
<span style="position:absolute; border: gray 1px solid; left:0px; top:50px; width:792px; height:612px;"></span>
|
||||
<div style="position:absolute; top:50px;"><a name="1">Page 1</a></div>
|
||||
<div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:715px; top:114px; width:11px; height:28px;"><span style="font-family: Ryumin-Light; font-size:11px"> 序
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:715px; top:374px; width:11px; height:9px;"><span style="font-family: Ryumin-Light; font-size:11px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:168px; top:105px; width:502px; height:210px;"><span style="font-family: Ryumin-Light; font-size:11px">わたくしといふ現象は
|
||||
<br>假定された有機交流電燈の
|
||||
<br>ひとつの青い照明です
|
||||
<br>(あらゆる透明な幽霊の複合体)
|
||||
<br>風景やみんなといっしょに
|
||||
<br>せはしくせはしく明滅しながら
|
||||
<br>いかにもたしかにともりつづける
|
||||
<br>因果交流電燈の
|
||||
<br>ひとつの青い照明です
|
||||
<br>(ひかりはたもち、その電燈は失はれ)
|
||||
<br>
|
||||
<br>これらは二十二箇月の
|
||||
<br>過去とかんずる方角から
|
||||
<br>紙と鑛質インクをつらね
|
||||
<br>(すべてわたくしと明滅し
|
||||
<br> みんなが同時に感ずるもの)
|
||||
<br>ここまでたもちつゞけられた
|
||||
<br>かげとひかりのひとくさりづつ
|
||||
<br>そのとほりの心象スケッチです
|
||||
<br>
|
||||
<br>これらについて人や銀河や修羅や海膽は
|
||||
<br>宇宙塵をたべ、または空気や塩水を呼吸しながら
|
||||
<br>それぞれ新鮮な本体論もかんがへませうが
|
||||
<br>それらも畢竟こゝろのひとつの風物です
|
||||
<br>たゞたしかに記録されたこれらのけしきは
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:11px">記録されたそのとほりのこのけしきで
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:11px">それが虚無ならば虚無自身がこのとほりで
|
||||
<br>ある程度まではみんなに共通いたします
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:11px">(すべてがわたくしの中のみんなであるやうに
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:11px"> みんなのおのおののなかのすべてですから)
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:101px; top:374px; width:536px; height:191px;"><span style="font-family: Ryumin-Light; font-size:11px">けれどもこれら新世代沖積世の
|
||||
<br>巨大に明るい時間の集積のなかで
|
||||
<br>正しくうつされた筈のこれらのことばが
|
||||
<br>わづかその一點にも均しい明暗のうちに
|
||||
<br> (あるひは修羅の十億年)
|
||||
<br>すでにはやくもその組立や質を變じ
|
||||
<br>しかもわたくしも印刷者も
|
||||
<br>それを変らないとして感ずることは
|
||||
<br>傾向としてはあり得ます
|
||||
<br>けだしわれわれがわれわれの感官や
|
||||
<br>風景や人物をかんずるやうに
|
||||
<br>そしてたゞ共通に感ずるだけであるやうに
|
||||
<br>記録や歴史、あるひは地史といふものも
|
||||
<br>それのいろいろの論料といっしょに
|
||||
<br>(因果の時空的制約のもとに)
|
||||
<br>われわれがかんじてゐるのに過ぎません
|
||||
<br>おそらくこれから二千年もたったころは
|
||||
<br>それ相當のちがった地質學が流用され
|
||||
<br>相當した證據もまた次次過去から現出し
|
||||
<br>みんなは二千年ぐらゐ前には
|
||||
<br>青ぞらいっぱいの無色な孔雀が居たとおもひ
|
||||
<br>新進の大學士たちは気圏のいちばんの上層
|
||||
<br>きらびやかな氷窒素のあたりから
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:11px">すてきな化石を發堀したり
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:11px">あるひは白堊紀砂岩の層面に
|
||||
<br>透明な人類の巨大な足跡を
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:11px">発見するかもしれません
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:11px">
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:11px">すべてこれらの命題は
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:11px">心象や時間それ自身の性質として
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:11px">第四次延長のなかで主張されます
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:11px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:74px; top:471px; width:11px; height:143px;"><span style="font-family: Ryumin-Light; font-size:11px">大正十三年一月廿日 宮澤賢治
|
||||
<br></span></div><div style="position:absolute; top:0px;">Page: <a href="#1">1</a></div>
|
||||
</body></html>
|
173
samples/jo.tex
|
@ -1,173 +0,0 @@
|
|||
\documentclass[landscape,twocolumn]{tarticle}
|
||||
|
||||
\setlength{\hoffset}{-0.6in}
|
||||
\setlength{\voffset}{-0.7in}
|
||||
|
||||
\setlength{\textwidth}{18cm}
|
||||
%\setlength{\textheight}{9in}
|
||||
|
||||
%\setlength{\oddsidemargin}{-0.5in}
|
||||
%\setlength{\evensidemargin}{-0.5in}
|
||||
\setlength{\topmargin}{0in}
|
||||
\setlength{\columnsep}{0.4in}
|
||||
|
||||
\pagestyle{empty}
|
||||
\makeatletter
|
||||
\def\kanjistrut{\vrule \@height0.88zw \@depth0.12zw \@width\z@}
|
||||
\newdimen\mytempdima
|
||||
\newcommand{\ruby}[2]{%
|
||||
\leavevmode
|
||||
\setbox0=\hbox{#1}%
|
||||
\mytempdima=\f@size\p@
|
||||
\setbox1=\hbox{\fontsize{0.5\mytempdima}{0pt}\selectfont #2}%
|
||||
\ifdim\wd0>\wd1 \dimen0=\wd0 \else \dimen0=\wd1 \fi
|
||||
\hbox{%
|
||||
\kanjiskip=0pt plus 2fil
|
||||
\xkanjiskip=0pt plus 2fil
|
||||
\vbox{%
|
||||
\hbox to \dimen0{%
|
||||
\fontsize{0.5\mytempdima}{0pt}\selectfont \kanjistrut\hfil#2\hfil}%
|
||||
\nointerlineskip
|
||||
\hbox to \dimen0{\kanjistrut\hfil#1\hfil}}}}
|
||||
\makeatother
|
||||
|
||||
\begin{document}
|
||||
|
||||
序
|
||||
\vspace{0.4in}
|
||||
|
||||
\begin{flushleft}
|
||||
わたくしといふ現象は
|
||||
|
||||
假定された有機交流電燈の
|
||||
|
||||
ひとつの青い照明です
|
||||
|
||||
(あらゆる透明な幽霊の複合体)
|
||||
|
||||
風景やみんなといっしょに
|
||||
|
||||
せはしくせはしく明滅しながら
|
||||
|
||||
いかにもたしかにともりつづける
|
||||
|
||||
因果交流電燈の
|
||||
|
||||
ひとつの青い照明です
|
||||
|
||||
(ひかりはたもち、その電燈は失はれ)
|
||||
|
||||
|
||||
|
||||
これらは二十二箇月の
|
||||
|
||||
過去とかんずる方角から
|
||||
|
||||
紙と鑛質インクをつらね
|
||||
|
||||
(すべてわたくしと明滅し
|
||||
|
||||
みんなが同時に感ずるもの)
|
||||
|
||||
ここまでたもちつゞけられた
|
||||
|
||||
かげとひかりのひとくさりづつ
|
||||
|
||||
そのとほりの心象スケッチです
|
||||
|
||||
|
||||
|
||||
これらについて人や銀河や修羅や海膽は
|
||||
|
||||
宇宙塵をたべ、または空気や塩水を呼吸しながら
|
||||
|
||||
それぞれ新鮮な本体論もかんがへませうが
|
||||
|
||||
それらも畢竟こゝろのひとつの風物です
|
||||
|
||||
たゞたしかに記録されたこれらのけしきは
|
||||
|
||||
記録されたそのとほりのこのけしきで
|
||||
|
||||
それが虚無ならば虚無自身がこのとほりで
|
||||
|
||||
ある程度まではみんなに共通いたします
|
||||
|
||||
(すべてがわたくしの中のみんなであるやうに
|
||||
|
||||
みんなのおのおののなかのすべてですから)
|
||||
\newpage
|
||||
|
||||
|
||||
\vspace{1.0in}
|
||||
|
||||
けれどもこれら新世代沖積世の
|
||||
|
||||
巨大に明るい時間の集積のなかで
|
||||
|
||||
正しくうつされた筈のこれらのことばが
|
||||
|
||||
わづかその一點にも均しい明暗のうちに
|
||||
|
||||
(あるひは修羅の十億年)
|
||||
|
||||
すでにはやくもその組立や質を變じ
|
||||
|
||||
しかもわたくしも印刷者も
|
||||
|
||||
それを変らないとして感ずることは
|
||||
|
||||
傾向としてはあり得ます
|
||||
|
||||
けだしわれわれがわれわれの感官や
|
||||
|
||||
風景や人物をかんずるやうに
|
||||
|
||||
そしてたゞ共通に感ずるだけであるやうに
|
||||
|
||||
記録や歴史、あるひは地史といふものも
|
||||
|
||||
それのいろいろの論料といっしょに
|
||||
|
||||
(因果の時空的制約のもとに)
|
||||
|
||||
われわれがかんじてゐるのに過ぎません
|
||||
|
||||
おそらくこれから二千年もたったころは
|
||||
|
||||
それ相當のちがった地質學が流用され
|
||||
|
||||
相當した證據もまた次次過去から現出し
|
||||
|
||||
みんなは二千年ぐらゐ前には
|
||||
|
||||
青ぞらいっぱいの無色な孔雀が居たとおもひ
|
||||
|
||||
新進の大學士たちは気圏のいちばんの上層
|
||||
|
||||
きらびやかな氷窒素のあたりから
|
||||
|
||||
すてきな化石を發堀したり
|
||||
|
||||
あるひは白堊紀砂岩の層面に
|
||||
|
||||
透明な人類の巨大な足跡を
|
||||
|
||||
発見するかもしれません
|
||||
|
||||
|
||||
|
||||
すべてこれらの命題は
|
||||
|
||||
心象や時間それ自身の性質として
|
||||
|
||||
第四次延長のなかで主張されます
|
||||
|
||||
|
||||
\end{flushleft}
|
||||
|
||||
\begin{flushright}
|
||||
大正十三年一月廿日 宮澤賢治
|
||||
\end{flushright}
|
||||
|
||||
\end{document}
|
|
@ -1,71 +0,0 @@
|
|||
序
|
||||
|
||||
|
||||
|
||||
わたくしといふ現象は
|
||||
假定された有機交流電燈の
|
||||
ひとつの青い照明です
|
||||
(あらゆる透明な幽霊の複合体)
|
||||
風景やみんなといっしょに
|
||||
せはしくせはしく明滅しながら
|
||||
いかにもたしかにともりつづける
|
||||
因果交流電燈の
|
||||
ひとつの青い照明です
|
||||
(ひかりはたもち、その電燈は失はれ)
|
||||
|
||||
これらは二十二箇月の
|
||||
過去とかんずる方角から
|
||||
紙と鑛質インクをつらね
|
||||
(すべてわたくしと明滅し
|
||||
みんなが同時に感ずるもの)
|
||||
ここまでたもちつゞけられた
|
||||
かげとひかりのひとくさりづつ
|
||||
そのとほりの心象スケッチです
|
||||
|
||||
これらについて人や銀河や修羅や海膽は
|
||||
宇宙塵をたべ、または空気や塩水を呼吸しながら
|
||||
それぞれ新鮮な本体論もかんがへませうが
|
||||
それらも畢竟こゝろのひとつの風物です
|
||||
たゞたしかに記録されたこれらのけしきは
|
||||
記録されたそのとほりのこのけしきで
|
||||
それが虚無ならば虚無自身がこのとほりで
|
||||
ある程度まではみんなに共通いたします
|
||||
(すべてがわたくしの中のみんなであるやうに
|
||||
みんなのおのおののなかのすべてですから)
|
||||
|
||||
けれどもこれら新世代沖積世の
|
||||
巨大に明るい時間の集積のなかで
|
||||
正しくうつされた筈のこれらのことばが
|
||||
わづかその一點にも均しい明暗のうちに
|
||||
(あるひは修羅の十億年)
|
||||
すでにはやくもその組立や質を變じ
|
||||
しかもわたくしも印刷者も
|
||||
それを変らないとして感ずることは
|
||||
傾向としてはあり得ます
|
||||
けだしわれわれがわれわれの感官や
|
||||
風景や人物をかんずるやうに
|
||||
そしてたゞ共通に感ずるだけであるやうに
|
||||
記録や歴史、あるひは地史といふものも
|
||||
それのいろいろの論料といっしょに
|
||||
(因果の時空的制約のもとに)
|
||||
われわれがかんじてゐるのに過ぎません
|
||||
おそらくこれから二千年もたったころは
|
||||
それ相當のちがった地質學が流用され
|
||||
相當した證據もまた次次過去から現出し
|
||||
みんなは二千年ぐらゐ前には
|
||||
青ぞらいっぱいの無色な孔雀が居たとおもひ
|
||||
新進の大學士たちは気圏のいちばんの上層
|
||||
きらびやかな氷窒素のあたりから
|
||||
すてきな化石を發堀したり
|
||||
あるひは白堊紀砂岩の層面に
|
||||
透明な人類の巨大な足跡を
|
||||
発見するかもしれません
|
||||
|
||||
すべてこれらの命題は
|
||||
心象や時間それ自身の性質として
|
||||
第四次延長のなかで主張されます
|
||||
|
||||
|
||||
大正十三年一月廿日 宮澤賢治
|
||||
|
||||
|
1188
samples/jo.xml.ref
File diff suppressed because it is too large
Binary file not shown.
|
@ -1,50 +0,0 @@
|
|||
<html><head>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
||||
</head><body>
|
||||
<span style="position:absolute; border: gray 1px solid; left:0px; top:50px; width:612px; height:792px;"></span>
|
||||
<div style="position:absolute; top:50px;"><a name="1">Page 1</a></div>
|
||||
<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:138px; top:113px; width:335px; height:17px;"><span style="font-family: Garamond,Bold; font-size:17px">T</span><span style="font-family: Garamond,Bold; font-size:13px">HE </span><span style="font-family: Garamond,Bold; font-size:17px">D</span><span style="font-family: Garamond,Bold; font-size:13px">IGITAL </span><span style="font-family: Garamond,Bold; font-size:17px">M</span><span style="font-family: Garamond,Bold; font-size:13px">ILLENNIUM </span><span style="font-family: Garamond,Bold; font-size:17px">C</span><span style="font-family: Garamond,Bold; font-size:13px">OPYRIGHT </span><span style="font-family: Garamond,Bold; font-size:17px">A</span><span style="font-family: Garamond,Bold; font-size:13px">CT OF </span><span style="font-family: Garamond,Bold; font-size:17px">1998
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:218px; top:132px; width:174px; height:14px;"><span style="font-family: Garamond,Bold; font-size:14px">U.S. Copyright Office Summary
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:72px; top:240px; width:93px; height:16px;"><span style="font-family: Garamond,Bold; font-size:15px">I</span><span style="font-family: Garamond,Bold; font-size:12px">NTRODUCTION
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:267px; top:214px; width:76px; height:13px;"><span style="font-family: Garamond,Bold; font-size:13px">December 1998
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:108px; top:270px; width:396px; height:67px;"><span style="font-family: Garamond; font-size:13px">The Digital Millennium Copyright Act (DMCA) was signed into law by
|
||||
<br>President Clinton on October 28, 1998. The legislation implements two 1996 World
|
||||
<br>Intellectual Property Organization (WIPO) treaties: the WIPO Copyright Treaty and
|
||||
<br>the WIPO Performances and Phonograms Treaty. The DMCA also addresses a
|
||||
<br>number of other significant copyright-related issues.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:381px; top:272px; width:3px; height:8px;"><span style="font-family: Garamond; font-size:8px">1
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:144px; top:351px; width:179px; height:13px;"><span style="font-family: Garamond; font-size:13px">The DMCA is divided into five titles:
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:144px; top:379px; width:8px; height:12px;"><span style="font-family: ELCKGH+WPTypographicSymbols; font-size:12px">!
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:144px; top:419px; width:8px; height:12px;"><span style="font-family: ELCKGH+WPTypographicSymbols; font-size:12px">!
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:144px; top:460px; width:8px; height:12px;"><span style="font-family: ELCKGH+WPTypographicSymbols; font-size:12px">!
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:143px; top:500px; width:8px; height:12px;"><span style="font-family: ELCKGH+WPTypographicSymbols; font-size:12px">!
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:144px; top:581px; width:8px; height:12px;"><span style="font-family: ELCKGH+WPTypographicSymbols; font-size:12px">!
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:179px; top:377px; width:324px; height:228px;"><span style="font-family: Garamond; font-size:13px">Title I, the “</span><span style="font-family: Garamond,Bold; font-size:13px">WIPO Copyright and Performances and Phonograms
|
||||
<br>Treaties Implementation Act of 1998</span><span style="font-family: Garamond; font-size:13px">,” implements the WIPO
|
||||
<br>treaties.
|
||||
<br>Title II, the “</span><span style="font-family: Garamond,Bold; font-size:13px">Online Copyright Infringement Liability Limitation
|
||||
<br>Act</span><span style="font-family: Garamond; font-size:13px">,” creates limitations on the liability of online service providers for
|
||||
<br>copyright infringement when engaging in certain types of activities.
|
||||
<br>Title III, the “</span><span style="font-family: Garamond,Bold; font-size:13px">Computer Maintenance Competition Assurance
|
||||
<br>Act</span><span style="font-family: Garamond; font-size:13px">,” creates an exemption for making a copy of a computer program
|
||||
<br>by activating a computer for purposes of maintenance or repair.
|
||||
<br>Title IV contains six </span><span style="font-family: Garamond,Bold; font-size:13px">miscellaneous provisions</span><span style="font-family: Garamond; font-size:13px">, relating to the
|
||||
<br>functions of the Copyright Office, distance education, the exceptions
|
||||
<br>in the Copyright Act for libraries and for making ephemeral recordings,
|
||||
<br>“webcasting” of sound recordings on the Internet, and the applicability
|
||||
<br>of collective bargaining agreement obligations in the case of transfers
|
||||
<br>of rights in motion pictures.
|
||||
<br>Title V, the “</span><span style="font-family: Garamond,Bold; font-size:13px">Vessel Hull Design Protection Act</span><span style="font-family: Garamond; font-size:13px">,” creates a new form
|
||||
<br>of protection for the design of vessel hulls.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:107px; top:619px; width:396px; height:53px;"><span style="font-family: Garamond; font-size:13px">This memorandum summarizes briefly each title of the DMCA. It provides
|
||||
<br>merely an overview of the law’s provisions; for purposes of length and readability a
|
||||
<br>significant amount of detail has been omitted. </span><span style="font-family: Garamond,Bold; font-size:13px">A complete understanding of any
|
||||
<br>provision of the DMCA requires reference to the text of the legislation itself.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:144px; top:726px; width:228px; height:12px;"><span style="font-family: Garamond; font-size:12px">Pub. L. No. 105-304, 112 Stat. 2860 (Oct. 28, 1998).
|
||||
<br></span><span style="font-family: Garamond; font-size:8px">1
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:108px; top:750px; width:106px; height:13px;"><span style="font-family: Garamond,Italic; font-size:13px">Copyright Office Summary
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:274px; top:750px; width:63px; height:13px;"><span style="font-family: Garamond,Italic; font-size:13px">December 1998
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:476px; top:750px; width:27px; height:13px;"><span style="font-family: Garamond,Italic; font-size:13px">Page 1
|
||||
<br></span></div><span style="position:absolute; border: black 1px solid; left:108px; top:719px; width:144px; height:1px;"></span>
|
||||
<div style="position:absolute; border: figure 1px solid; writing-mode:False; left:285px; top:163px; width:44px; height:42px;"></div><div style="position:absolute; top:0px;">Page: <a href="#1">1</a></div>
|
||||
</body></html>
|
|
@ -1,61 +0,0 @@
|
|||
THE DIGITAL MILLENNIUM COPYRIGHT ACT OF 1998
|
||||
|
||||
U.S. Copyright Office Summary
|
||||
|
||||
INTRODUCTION
|
||||
|
||||
December 1998
|
||||
|
||||
The Digital Millennium Copyright Act (DMCA) was signed into law by
|
||||
President Clinton on October 28, 1998. The legislation implements two 1996 World
|
||||
Intellectual Property Organization (WIPO) treaties: the WIPO Copyright Treaty and
|
||||
the WIPO Performances and Phonograms Treaty. The DMCA also addresses a
|
||||
number of other significant copyright-related issues.
|
||||
|
||||
1
|
||||
|
||||
The DMCA is divided into five titles:
|
||||
|
||||
!
|
||||
|
||||
!
|
||||
|
||||
!
|
||||
|
||||
!
|
||||
|
||||
!
|
||||
|
||||
Title I, the “WIPO Copyright and Performances and Phonograms
|
||||
Treaties Implementation Act of 1998,” implements the WIPO
|
||||
treaties.
|
||||
Title II, the “Online Copyright Infringement Liability Limitation
|
||||
Act,” creates limitations on the liability of online service providers for
|
||||
copyright infringement when engaging in certain types of activities.
|
||||
Title III, the “Computer Maintenance Competition Assurance
|
||||
Act,” creates an exemption for making a copy of a computer program
|
||||
by activating a computer for purposes of maintenance or repair.
|
||||
Title IV contains six miscellaneous provisions, relating to the
|
||||
functions of the Copyright Office, distance education, the exceptions
|
||||
in the Copyright Act for libraries and for making ephemeral recordings,
|
||||
“webcasting” of sound recordings on the Internet, and the applicability
|
||||
of collective bargaining agreement obligations in the case of transfers
|
||||
of rights in motion pictures.
|
||||
Title V, the “Vessel Hull Design Protection Act,” creates a new form
|
||||
of protection for the design of vessel hulls.
|
||||
|
||||
This memorandum summarizes briefly each title of the DMCA. It provides
|
||||
merely an overview of the law’s provisions; for purposes of length and readability a
|
||||
significant amount of detail has been omitted. A complete understanding of any
|
||||
provision of the DMCA requires reference to the text of the legislation itself.
|
||||
|
||||
Pub. L. No. 105-304, 112 Stat. 2860 (Oct. 28, 1998).
|
||||
1
|
||||
|
||||
Copyright Office Summary
|
||||
|
||||
December 1998
|
||||
|
||||
Page 1
|
||||
|
||||
|
File diff suppressed because it is too large
|
@ -1,475 +0,0 @@
|
|||
<html><head>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
||||
</head><body>
|
||||
<span style="position:absolute; border: gray 1px solid; left:0px; top:50px; width:611px; height:791px;"></span>
|
||||
<div style="position:absolute; top:50px;"><a name="1">Page 1</a></div>
|
||||
<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:490px; top:85px; width:65px; height:16px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">OMB No. 1545-0074
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:499px; top:93px; width:45px; height:34px;"><span style="font-family: HelveticaNeue-Bold; font-size:26px">20</span><span style="font-family: Helvetica-Condensed-Black; font-size:27px">07
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:425px; top:123px; width:111px; height:18px;"><span style="font-family: HelveticaNeue-Bold; font-size:9px">Identifying number (see page 8)
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:480px; top:150px; width:2px; height:28px;"><span style="font-family: HelveticaNeue-Roman; font-size:9px"> I
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:425px; top:149px; width:32px; height:18px;"><span style="font-family: HelveticaNeue-Roman; font-size:9px">Check if:
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:425px; top:150px; width:112px; height:33px;"><span style="font-family: HelveticaNeue-Roman; font-size:9px">ndividual
|
||||
<br>
|
||||
<br>Estate or Trust
|
||||
<br>Type of entry visa (see page 8)
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:437px; top:113px; width:12px; height:16px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">, 20
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:425px; top:184px; width:6px; height:14px;"><span style="font-family: Universal-NewswithCommPi; font-size:5px">䊳
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:61px; top:148px; width:330px; height:8px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">resent home address (number, street, and apt. no., or rural route). If you have a P.O. box, see page 8.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:56px; top:157px; width:1px; height:8px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:56px; top:172px; width:323px; height:18px;"><span style="font-family: HelveticaNeue-Roman; font-size:9px">City, town or post office, state, and ZIP code. If you have a foreign address, see page 8.
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:56px; top:196px; width:233px; height:37px;"><span style="font-family: HelveticaNeue-Roman; font-size:9px">Country </span><span style="font-family: Universal-NewswithCommPi; font-size:6px">䊳
|
||||
<br></span><span style="font-family: HelveticaNeue-Roman; font-size:9px">Give address </span><span style="font-family: HelveticaNeue-Bold; font-size:9px">outside the United States </span><span style="font-family: HelveticaNeue-Roman; font-size:9px">to which you want any
|
||||
<br>refund check mailed. If same as above, write “Same.”
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:88px; top:207px; width:1px; height:8px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:251px; top:196px; width:195px; height:10px;"><span style="font-family: HelveticaNeue-Roman; font-size:9px">Of what country were you a </span><span style="font-family: HelveticaNeue-Bold; font-size:9px">citizen </span><span style="font-family: HelveticaNeue-Roman; font-size:9px">or national during the tax year? </span><span style="font-family: Universal-NewswithCommPi; font-size:6px">䊳
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:316px; top:208px; width:241px; height:25px;"><span style="font-family: HelveticaNeue-Roman; font-size:9px">Give address in the country where you are a </span><span style="font-family: HelveticaNeue-Bold; font-size:9px">permanent resident.
|
||||
<br></span><span style="font-family: HelveticaNeue-Roman; font-size:9px">If same as above, write “Same.”
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:439px; top:207px; width:1px; height:8px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:176px; top:82px; width:269px; height:17px;"><span style="font-family: FranklinGothic-Demi; font-size:16px">U.S. Nonresident Alien Income Tax Return
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:146px; top:113px; width:30px; height:8px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">beginning
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:211px; top:100px; width:195px; height:16px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">For the year January 1–December 31, 2007, or other tax year
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:260px; top:113px; width:61px; height:8px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">, 2007, and ending
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:251px; top:123px; width:37px; height:18px;"><span style="font-family: HelveticaNeue-Roman; font-size:9px">Last name
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:55px; top:79px; width:66px; height:32px;"><span style="font-family: Helvetica-Condensed-Black; font-size:30px">1040NR
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:34px; top:97px; width:87px; height:32px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">Form
|
||||
<br>Department of the Treasury
|
||||
<br>Internal Revenue Service
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:56px; top:123px; width:93px; height:9px;"><span style="font-family: HelveticaNeue-Roman; font-size:9px">Your first name and initial
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:56px; top:132px; width:4px; height:24px;"><span style="font-family: HelveticaNeue-Roman; font-size:9px"> </span><span style="font-family: HelveticaNeue-Roman; font-size:8px">P
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:36px; top:146px; width:9px; height:78px;"><span style="font-family: HelveticaNeue-Bold; font-size:5px">P</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">l</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">ea</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">s</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">e</span><span style="font-family: HelveticaNeue-Bold; font-size:2px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:4px">p</span><span style="font-family: HelveticaNeue-Bold; font-size:3px">r</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">i</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">n</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">t</span><span style="font-family: HelveticaNeue-Bold; font-size:2px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:4px">o</span><span style="font-family: HelveticaNeue-Bold; font-size:3px">r</span><span style="font-family: HelveticaNeue-Bold; font-size:2px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:2px">t</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">y</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">p</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">e</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:37px; top:221px; width:9px; height:2px;"><span style="font-family: HelveticaNeue-Bold; font-size:2px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:40px; top:273px; width:8px; height:159px;"><span style="font-family: HelveticaNeue-Bold; font-size:4px">A</span><span style="font-family: HelveticaNeue-Bold; font-size:1px">l</span><span style="font-family: HelveticaNeue-Bold; font-size:3px">s</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">o</span><span style="font-family: HelveticaNeue-Bold; font-size:1px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:4px">a</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">tt</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">a</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">c</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">h</span><span style="font-family: HelveticaNeue-Bold; font-size:1px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:4px">F</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">o</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">r</span><span style="font-family: HelveticaNeue-Bold; font-size:6px">m</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">(</span><span style="font-family: HelveticaNeue-Bold; font-size:3px">s</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">)</span><span style="font-family: HelveticaNeue-Bold; font-size:1px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:3px">1099</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">-</span><span style="font-family: HelveticaNeue-Bold; font-size:5px">R</span><span style="font-family: HelveticaNeue-Bold; font-size:1px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:1px">i</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">f</span><span style="font-family: HelveticaNeue-Bold; font-size:1px"> </span><span style="font-family: HelveticaNeue-Bold; 
font-size:2px">t</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">a</span><span style="font-family: HelveticaNeue-Bold; font-size:3px">x</span><span style="font-family: HelveticaNeue-Bold; font-size:1px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:5px">w</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">a</span><span style="font-family: HelveticaNeue-Bold; font-size:3px">s</span><span style="font-family: HelveticaNeue-Bold; font-size:1px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:5px">w</span><span style="font-family: HelveticaNeue-Bold; font-size:1px">i</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">t</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">hh</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">e</span><span style="font-family: HelveticaNeue-Bold; font-size:1px">l</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">d</span><span style="font-family: HelveticaNeue-Bold; font-size:1px">.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:33px; top:312px; width:8px; height:80px;"><span style="font-family: HelveticaNeue-Bold; font-size:4px">A</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">tt</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">a</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">c</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">h</span><span style="font-family: HelveticaNeue-Bold; font-size:1px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:4px">F</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">o</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">r</span><span style="font-family: HelveticaNeue-Bold; font-size:6px">m</span><span style="font-family: HelveticaNeue-Bold; font-size:3px">s</span><span style="font-family: HelveticaNeue-Bold; font-size:1px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:6px">W</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">-</span><span style="font-family: HelveticaNeue-Bold; font-size:3px">2</span><span style="font-family: HelveticaNeue-Bold; font-size:1px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:4px">h</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">e</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">r</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">e</span><span style="font-family: HelveticaNeue-Bold; font-size:1px">.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:502px; top:243px; width:10px; height:11px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">7a
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:538px; top:243px; width:10px; height:11px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">7b
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:494px; top:256px; width:27px; height:9px;"><span style="font-family: HelveticaNeue-Bold; font-size:8px">Yourself
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:532px; top:256px; width:25px; height:9px;"><span style="font-family: HelveticaNeue-Bold; font-size:8px">Spouse
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:483px; top:286px; width:3px; height:41px;"><span style="font-family: Universal-GreekwithMathPi; font-size:25px">兵</span><span style="font-family: HelveticaNeue-Roman; font-size:27px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:528px; top:344px; width:7px; height:6px;"><span style="font-family: Universal-NewswithCommPi; font-size:6px">䊳
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:528px; top:374px; width:7px; height:6px;"><span style="font-family: Universal-NewswithCommPi; font-size:6px">䊳
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:528px; top:398px; width:7px; height:6px;"><span style="font-family: Universal-NewswithCommPi; font-size:6px">䊳
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:528px; top:415px; width:7px; height:6px;"><span style="font-family: Universal-NewswithCommPi; font-size:6px">䊳
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:528px; top:438px; width:7px; height:6px;"><span style="font-family: Universal-NewswithCommPi; font-size:6px">䊳
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:535px; top:354px; width:1px; height:8px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:535px; top:384px; width:1px; height:8px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:535px; top:409px; width:1px; height:8px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:535px; top:426px; width:1px; height:8px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:535px; top:448px; width:1px; height:8px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:453px; top:339px; width:73px; height:86px;"><span style="font-family: HelveticaNeue-Bold; font-size:8px">No. of boxes checked
|
||||
<br>on 7a and 7b
|
||||
<br>
|
||||
<br>No. of children on
|
||||
<br>7c who:
|
||||
<br>
|
||||
<br></span><span style="font-family: Universal-NewswithCommPi; font-size:6px">● </span><span style="font-family: HelveticaNeue-Bold; font-size:8px">lived with you
|
||||
<br> </span><span style="font-family: Universal-NewswithCommPi; font-size:6px">● </span><span style="font-family: HelveticaNeue-Bold; font-size:8px">did not live with
|
||||
<br>you due to divorce
|
||||
<br>or separation
|
||||
<br>
|
||||
<br>Dependents on 7c
|
||||
<br>not entered above
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:458px; top:432px; width:68px; height:8px;"><span style="font-family: HelveticaNeue-Bold; font-size:8px">dd numbers entered
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:453px; top:439px; width:49px; height:15px;"><span style="font-family: HelveticaNeue-Bold; font-size:8px">on lines above
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:459px; top:449px; width:10px; height:23px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">8
|
||||
<br>
|
||||
<br>9a
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:453px; top:424px; width:4px; height:16px;"><span style="font-family: HelveticaNeue-Bold; font-size:8px"> A
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:457px; top:485px; width:15px; height:11px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">10a
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:457px; top:509px; width:15px; height:132px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">11
|
||||
<br>12
|
||||
<br>13
|
||||
<br>14
|
||||
<br>15
|
||||
<br>16b
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">17b
|
||||
<br>18
|
||||
<br>19
|
||||
<br>20
|
||||
<br>21
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:459px; top:654px; width:10px; height:11px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">23
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:459px; top:786px; width:10px; height:23px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">34</span><span style="font-family: HelveticaNeue-Bold; font-size:10px">
|
||||
<br>35</span><span style="font-family: HelveticaNeue-Bold; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:484px; top:810px; width:78px; height:11px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">Form </span><span style="font-family: HelveticaNeue-Bold; font-size:11px">1040NR </span><span style="font-family: HelveticaNeue-Roman; font-size:8px">(2007)
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:543px; top:820px; width:1px; height:8px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:338px; top:297px; width:140px; height:19px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">If you check box 7b, enter your spouse’s
|
||||
<br>identifying number </span><span style="font-family: Universal-NewswithCommPi; font-size:5px">䊳
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:331px; top:371px; width:36px; height:8px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">relationship
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:339px; top:378px; width:20px; height:15px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">to you
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:391px; top:364px; width:43px; height:29px;"><span style="font-family: Helvetica-Condensed-Bold; font-size:8px">(4)
|
||||
<br></span><span style="font-family: Helvetica-Condensed; font-size:8px">if qualifying
|
||||
<br>child for child tax
|
||||
<br>credit (see page 9)
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:304px; top:254px; width:2px; height:10px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:413px; top:321px; width:1px; height:6px;"><span style="font-family: HelveticaNeue-Roman; font-size:6px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:350px; top:282px; width:2px; height:10px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:331px; top:290px; width:3px; height:41px;"><span style="font-family: Universal-GreekwithMathPi; font-size:25px">其</span><span style="font-family: HelveticaNeue-Roman; font-size:27px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:56px; top:255px; width:367px; height:119px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">Filing status. Check only one box (1–6 below).
|
||||
<br>
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">1
|
||||
<br>
|
||||
<br>2
|
||||
<br>
|
||||
<br>3
|
||||
<br>
|
||||
<br>4
|
||||
<br>
|
||||
<br>5
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">6
|
||||
<br>
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">Caution: Do not </span><span style="font-family: HelveticaNeue-Roman; font-size:9px">check box 7a if your parent (or someone else) can claim you as a dependent.
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">Do not </span><span style="font-family: HelveticaNeue-Roman; font-size:9px">check box 7b if your spouse had any U.S. gross income.
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">7c
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:92px; top:271px; width:253px; height:81px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">Single resident of Canada or Mexico, or a single U.S. national
|
||||
<br>Other single nonresident alien
|
||||
<br>Married resident of Canada or Mexico, or a married U.S. national
|
||||
<br>
|
||||
<br>Married resident of the Republic of Korea (South Korea)
|
||||
<br>
|
||||
<br>Other married nonresident alien
|
||||
<br></span><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br>Qualifying widow(er) with dependent child (see page 9)
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:72px; top:365px; width:83px; height:8px;"><span style="font-family: HelveticaNeue-Bold; font-size:8px">Dependents: </span><span style="font-family: HelveticaNeue-Roman; font-size:8px">(see page 9)
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:324px; top:364px; width:50px; height:8px;"><span style="font-family: HelveticaNeue-Bold; font-size:8px">(3)</span><span style="font-family: HelveticaNeue-Roman; font-size:8px"> Dependent’s
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:219px; top:294px; width:2px; height:10px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:90px; top:361px; width:2px; height:9px;"><span style="font-family: HelveticaNeue-Roman; font-size:9px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:111px; top:243px; width:243px; height:10px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">Filing Status and Exemptions for Individuals </span><span style="font-family: HelveticaNeue-Roman; font-size:10px">(see page 8)
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:117px; top:373px; width:1px; height:8px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:72px; top:378px; width:43px; height:8px;"><span style="font-family: HelveticaNeue-Bold; font-size:8px">(1)</span><span style="font-family: HelveticaNeue-Roman; font-size:8px"> First name
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:80px; top:385px; width:1px; height:8px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:169px; top:378px; width:33px; height:15px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">Last name
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:239px; top:367px; width:58px; height:24px;"><span style="font-family: HelveticaNeue-Bold; font-size:8px">(2) </span><span style="font-family: HelveticaNeue-Roman; font-size:8px">Dependent’s
|
||||
<br>identifying number
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:249px; top:385px; width:2px; height:52px;"><span style="font-family: HelveticaNeue-Roman; font-size:9px">.
|
||||
<br>.
|
||||
<br>.
|
||||
<br>
|
||||
<br>.
|
||||
<br>.
|
||||
<br>.
|
||||
<br>
|
||||
<br></span><span style="font-family: HelveticaNeue-Roman; font-size:9px">.
|
||||
<br>.
|
||||
<br>.
|
||||
<br>
|
||||
<br></span><span style="font-family: HelveticaNeue-Roman; font-size:9px">.
|
||||
<br>.
|
||||
<br>.
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:275px; top:385px; width:2px; height:52px;"><span style="font-family: HelveticaNeue-Roman; font-size:9px">.
|
||||
<br>.
|
||||
<br>.
|
||||
<br>
|
||||
<br>.
|
||||
<br>.
|
||||
<br>.
|
||||
<br>
|
||||
<br></span><span style="font-family: HelveticaNeue-Roman; font-size:9px">.
|
||||
<br>.
|
||||
<br>.
|
||||
<br>
|
||||
<br></span><span style="font-family: HelveticaNeue-Roman; font-size:9px">.
|
||||
<br>.
|
||||
<br>.
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:435px; top:664px; width:2px; height:6px;"><span style="font-family: Universal-NewswithCommPi; font-size:6px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:281px; top:486px; width:2px; height:10px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:318px; top:727px; width:2px; height:10px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:435px; top:607px; width:2px; height:10px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:340px; top:668px; width:2px; height:21px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:358px; top:473px; width:10px; height:11px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">9b
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:355px; top:497px; width:15px; height:11px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">10b
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:315px; top:571px; width:15px; height:23px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">16b
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">17b
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:211px; top:569px; width:15px; height:33px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">16a
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">
|
||||
<br>17a
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:85px; top:438px; width:357px; height:377px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">Total number of exemptions claimed
|
||||
<br>
|
||||
<br>Wages, salaries, tips, etc. Attach Form(s) W-2
|
||||
<br>
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">Taxable </span><span style="font-family: HelveticaNeue-Roman; font-size:10px">interest
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">Tax-exempt </span><span style="font-family: HelveticaNeue-Roman; font-size:10px">interest. </span><span style="font-family: HelveticaNeue-Bold; font-size:10px">Do not </span><span style="font-family: HelveticaNeue-Roman; font-size:10px">include on line 9a
|
||||
<br>Ordinary dividends
|
||||
<br>
|
||||
<br>Qualified dividends (see page 11)
|
||||
<br>Taxable refunds, credits, or offsets of state and local income taxes (see page 11)
|
||||
<br>
|
||||
<br>Scholarship and fellowship grants. Attach Form(s) 1042-S or required statement (see page 11)
|
||||
<br>
|
||||
<br>Business income or (loss). Attach Schedule C or C-EZ (Form 1040)
|
||||
<br>
|
||||
<br>Capital gain or (loss). Attach Schedule D (Form 1040) if required. If not required, check here
|
||||
<br>
|
||||
<br>Other gains or (losses). Attach Form 4797
|
||||
<br>IRA distributions
|
||||
<br></span><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br>Pensions and annuities
|
||||
<br>Rental real estate, royalties, partnerships, trusts, etc. Attach Schedule E (Form 1040)
|
||||
<br>Farm income or (loss). Attach Schedule F (Form 1040)
|
||||
<br>
|
||||
<br>Unemployment compensation
|
||||
<br>Other income. List type and amount (see page 15)
|
||||
<br>Total income exempt by a treaty from page 5, Item M
|
||||
<br>Add lines 8, 9a, 10a, 11–15, 16b, and 17b–21. This is your </span><span style="font-family: HelveticaNeue-Bold; font-size:10px">total effectively connected income </span><span style="font-family: Universal-NewswithCommPi; font-size:6px">䊳
|
||||
<br></span><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br>Educator expenses (see page 15)
|
||||
<br>Health savings account deduction. Attach Form 8889
|
||||
<br>Moving expenses. Attach Form 3903
|
||||
<br></span><span style="font-family: HelveticaNeue-Roman; font-size:10px">Self-employed SEP, SIMPLE, and qualified plans
|
||||
<br></span><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br>Self-employed health insurance deduction (see page 16)
|
||||
<br>Penalty on early withdrawal of savings
|
||||
<br>
|
||||
<br>Scholarship and fellowship grants excluded
|
||||
<br>
|
||||
<br>IRA deduction (see page 16)
|
||||
<br>
|
||||
<br>Student loan interest deduction (see page 16)
|
||||
<br>
|
||||
<br>Domestic production activities deduction. Attach Form 8903
|
||||
<br>
|
||||
<br>Add lines 24 through 33
|
||||
<br>Subtract line 34 from line 23. Enter here and on line 36. This is your </span><span style="font-family: HelveticaNeue-Bold; font-size:10px">adjusted gross income </span><span style="font-family: Universal-NewswithCommPi; font-size:6px">䊳
|
||||
<br></span><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span><span style="font-family: Universal-NewswithCommPi; font-size:6px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:358px; top:666px; width:10px; height:120px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">24
|
||||
<br>25
|
||||
<br>26
|
||||
<br>27
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">28
|
||||
<br>29
|
||||
<br>30
|
||||
<br>31
|
||||
<br>32
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">33
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:334px; top:572px; width:116px; height:33px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">Taxable amount (see page 12)
|
||||
<br></span><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br>Taxable amount (see page 13)
|
||||
<br></span><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:358px; top:642px; width:10px; height:11px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">22
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:184px; top:595px; width:2px; height:10px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:206px; top:631px; width:2px; height:10px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:296px; top:644px; width:2px; height:10px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:308px; top:691px; width:2px; height:10px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:241px; top:703px; width:2px; height:10px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:150px; top:474px; width:2px; height:10px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:256px; top:571px; width:2px; height:10px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:223px; top:511px; width:2px; height:10px;"><span style="font-family: HelveticaNeue-Roman; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:63px; top:437px; width:5px; height:11px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">d
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:65px; top:451px; width:16px; height:59px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">8
|
||||
<br>
|
||||
<br>9a
|
||||
<br>b
|
||||
<br>
|
||||
<br>10a
|
||||
<br>b
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:65px; top:511px; width:15px; height:297px;"><span style="font-family: HelveticaNeue-Bold; font-size:10px">11
|
||||
<br>12
|
||||
<br>13
|
||||
<br>14
|
||||
<br>15
|
||||
<br>16a
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">17a</span><span style="font-family: HelveticaNeue-Bold; font-size:10px">
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">18
|
||||
<br>19
|
||||
<br>20
|
||||
<br>21
|
||||
<br>22
|
||||
<br>23
|
||||
<br>24
|
||||
<br>25
|
||||
<br>26
|
||||
<br>27
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">28
|
||||
<br>29
|
||||
<br>30
|
||||
<br>31
|
||||
<br>32
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">33
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">34</span><span style="font-family: HelveticaNeue-Bold; font-size:10px">
|
||||
<br></span><span style="font-family: HelveticaNeue-Bold; font-size:10px">35</span><span style="font-family: HelveticaNeue-Bold; font-size:10px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:51px; top:462px; width:8px; height:189px;"><span style="font-family: HelveticaNeue-Bold; font-size:2px">I</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">n</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">c</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">o</span><span style="font-family: HelveticaNeue-Bold; font-size:6px">m</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">e</span><span style="font-family: HelveticaNeue-Bold; font-size:1px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:4px">E</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">ff</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">ec</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">t</span><span style="font-family: HelveticaNeue-Bold; font-size:1px">i</span><span style="font-family: HelveticaNeue-Bold; font-size:3px">v</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">e</span><span style="font-family: HelveticaNeue-Bold; font-size:1px">l</span><span style="font-family: HelveticaNeue-Bold; font-size:3px">y</span><span style="font-family: HelveticaNeue-Bold; font-size:1px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:5px">C</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">o</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">nn</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">e</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">c</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">t</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">e</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">d</span><span style="font-family: HelveticaNeue-Bold; font-size:1px"> </span><span style="font-family: HelveticaNeue-Bold; 
font-size:6px">W</span><span style="font-family: HelveticaNeue-Bold; font-size:1px">i</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">t</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">h</span><span style="font-family: HelveticaNeue-Bold; font-size:1px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:5px">U</span><span style="font-family: HelveticaNeue-Bold; font-size:1px">.</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">S</span><span style="font-family: HelveticaNeue-Bold; font-size:1px">. </span><span style="font-family: HelveticaNeue-Bold; font-size:4px">T</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">r</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">a</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">d</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">e</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">/</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">B</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">u</span><span style="font-family: HelveticaNeue-Bold; font-size:3px">s</span><span style="font-family: HelveticaNeue-Bold; font-size:1px">i</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">n</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">e</span><span style="font-family: HelveticaNeue-Bold; font-size:3px">ss
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:52px; top:649px; width:8px; height:1px;"><span style="font-family: HelveticaNeue-Bold; font-size:1px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:50px; top:692px; width:9px; height:90px;"><span style="font-family: HelveticaNeue-Bold; font-size:5px">A</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">d</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">j</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">u</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">s</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">t</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">e</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">d</span><span style="font-family: HelveticaNeue-Bold; font-size:2px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:6px">G</span><span style="font-family: HelveticaNeue-Bold; font-size:3px">r</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">o</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">ss</span><span style="font-family: HelveticaNeue-Bold; font-size:2px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:2px">I</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">n</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">c</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">o</span><span style="font-family: HelveticaNeue-Bold; font-size:7px">m</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">e
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:51px; top:780px; width:9px; height:2px;"><span style="font-family: HelveticaNeue-Bold; font-size:2px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:35px; top:531px; width:9px; height:159px;"><span style="font-family: HelveticaNeue-Bold; font-size:5px">E</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">n</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">c</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">l</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">o</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">s</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">e</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">, </span><span style="font-family: HelveticaNeue-Bold; font-size:4px">b</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">u</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">t</span><span style="font-family: HelveticaNeue-Bold; font-size:2px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:4px">do</span><span style="font-family: HelveticaNeue-Bold; font-size:2px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:4px">n</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">o</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">t</span><span style="font-family: HelveticaNeue-Bold; font-size:2px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:4px">a</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">tt</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">ac</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">h</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">, </span><span style="font-family: HelveticaNeue-Bold; font-size:4px">a</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">n</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">y</span><span style="font-family: HelveticaNeue-Bold; 
font-size:2px"> </span><span style="font-family: HelveticaNeue-Bold; font-size:4px">p</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">a</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">y</span><span style="font-family: HelveticaNeue-Bold; font-size:7px">m</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">e</span><span style="font-family: HelveticaNeue-Bold; font-size:4px">n</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">t</span><span style="font-family: HelveticaNeue-Bold; font-size:2px">.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:35px; top:688px; width:9px; height:2px;"><span style="font-family: HelveticaNeue-Bold; font-size:2px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:48px; top:431px; width:8px; height:1px;"><span style="font-family: HelveticaNeue-Bold; font-size:1px">
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:34px; top:813px; width:269px; height:16px;"><span style="font-family: HelveticaNeue-Bold; font-size:8px">For Disclosure, Privacy Act, and Paperwork Reduction Act Notice, see page 32.
|
||||
<br>
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:367px; top:813px; width:53px; height:15px;"><span style="font-family: HelveticaNeue-Roman; font-size:8px">Cat. No. 11364D
|
||||
<br>
|
||||
<br></span></div><span style="position:absolute; border: black 1px solid; left:453px; top:496px; width:21px; height:12px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:453px; top:472px; width:21px; height:12px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:453px; top:641px; width:21px; height:12px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:453px; top:665px; width:21px; height:120px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:526px; top:315px; width:36px; height:24px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:526px; top:267px; width:36px; height:24px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:482px; top:95px; width:79px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:135px; top:87px; width:0px; height:35px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:482px; top:87px; width:0px; height:35px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:34px; top:123px; width:528px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:244px; top:123px; width:0px; height:24px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:123px; width:0px; height:686px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:417px; top:123px; width:0px; height:72px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:147px; width:513px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:171px; width:513px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:195px; width:513px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:207px; width:513px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:309px; top:207px; width:0px; height:36px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:243px; width:513px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:526px; top:243px; width:0px; height:96px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:490px; top:243px; width:0px; height:96px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:255px; width:513px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:267px; width:513px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:358px; top:279px; width:121px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:490px; top:279px; width:36px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:226px; top:291px; width:253px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:490px; top:291px; width:36px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:490px; top:303px; width:72px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:415px; top:316px; width:61px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:490px; top:316px; width:72px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:226px; top:328px; width:253px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:490px; top:328px; width:36px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:322px; top:340px; width:157px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:490px; top:340px; width:72px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:538px; top:352px; width:24px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:364px; width:397px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:381px; top:364px; width:0px; height:72px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:316px; top:364px; width:0px; height:72px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:215px; top:364px; width:0px; height:72px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:388px; width:397px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:538px; top:382px; width:24px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:400px; width:397px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:412px; width:397px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:424px; width:397px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:538px; top:407px; width:24px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:436px; width:397px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:538px; top:424px; width:24px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:238px; top:446px; width:205px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:448px; width:513px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:475px; top:448px; width:0px; height:361px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:448px; width:0px; height:361px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:540px; top:448px; width:0px; height:361px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:460px; width:108px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:286px; top:460px; width:157px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:63px; top:448px; width:0px; height:361px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:472px; width:108px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:154px; top:472px; width:289px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:286px; top:484px; width:61px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:432px; top:475px; width:0px; height:9px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:374px; top:475px; width:0px; height:9px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:475px; width:0px; height:9px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:484px; width:101px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:166px; top:496px; width:277px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:496px; width:108px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:430px; top:520px; width:13px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:532px; width:108px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:430px; top:532px; width:13px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:544px; width:108px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:370px; top:544px; width:73px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:556px; width:108px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:262px; top:568px; width:181px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:568px; width:108px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:287px; top:571px; width:0px; height:21px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:229px; top:571px; width:0px; height:21px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:208px; top:571px; width:0px; height:21px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:154px; top:581px; width:49px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:208px; top:581px; width:101px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:581px; width:108px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:190px; top:593px; width:13px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:208px; top:593px; width:101px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:593px; width:108px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:442px; top:605px; width:1px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:605px; width:108px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:310px; top:617px; width:133px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:617px; width:108px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:214px; top:629px; width:229px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:629px; width:108px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:296px; top:641px; width:145px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:641px; width:108px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:430px; top:662px; width:1px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:49px; top:665px; width:513px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:310px; top:689px; width:37px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:677px; width:101px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:322px; top:725px; width:25px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:701px; width:101px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:286px; top:713px; width:61px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:725px; width:101px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:250px; top:737px; width:97px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:737px; width:101px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:274px; top:749px; width:73px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:773px; width:101px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:202px; top:760px; width:145px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:785px; width:101px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:274px; top:773px; width:73px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:34px; top:809px; width:528px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:446px; top:364px; width:0px; height:72px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:310px; top:652px; width:37px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:432px; top:644px; width:0px; height:9px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:374px; top:644px; width:0px; height:9px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:644px; width:0px; height:9px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:653px; width:101px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:309px; top:571px; width:0px; height:21px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:250px; top:701px; width:97px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:689px; width:101px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:749px; width:101px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:761px; width:101px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:400px; top:367px; width:1px; height:3px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:402px; top:360px; width:5px; height:11px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:244px; top:195px; width:0px; height:12px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:238px; top:677px; width:97px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:226px; top:508px; width:121px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:508px; width:101px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:374px; top:499px; width:0px; height:9px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:499px; width:0px; height:9px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:432px; top:499px; width:0px; height:9px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:520px; width:108px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:713px; width:101px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:334px; top:785px; width:13px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:190px; top:796px; width:253px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:430px; top:807px; width:1px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:454px; top:797px; width:108px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:374px; top:665px; width:0px; height:120px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:467px; top:151px; width:8px; height:8px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:467px; top:161px; width:8px; height:8px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:77px; top:272px; width:8px; height:8px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:77px; top:284px; width:8px; height:8px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:77px; top:296px; width:8px; height:8px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:77px; top:308px; width:8px; height:8px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:77px; top:320px; width:8px; height:8px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:77px; top:332px; width:8px; height:8px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:409px; top:390px; width:8px; height:8px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:409px; top:402px; width:8px; height:8px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:409px; top:414px; width:8px; height:8px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:409px; top:426px; width:8px; height:8px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:539px; top:427px; width:21px; height:18px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:434px; top:549px; width:8px; height:8px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:432px; top:665px; width:0px; height:120px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:352px; top:665px; width:0px; height:120px;"></span>
|
||||
<div style="position:absolute; top:0px;">Page: <a href="#1">1</a></div>
|
||||
</body></html>
|
|
@ -1,431 +0,0 @@
|
|||
OMB No. 1545-0074
|
||||
|
||||
|
||||
2007
|
||||
|
||||
Identifying number (see page 8)
|
||||
|
||||
|
||||
I
|
||||
|
||||
Check if:
|
||||
|
||||
|
||||
ndividual
|
||||
|
||||
Estate or Trust
|
||||
Type of entry visa (see page 8)
|
||||
|
||||
, 20
|
||||
|
||||
|
||||
䊳
|
||||
|
||||
resent home address (number, street, and apt. no., or rural route). If you have a P.O. box, see page 8.
|
||||
|
||||
|
||||
|
||||
City, town or post office, state, and ZIP code. If you have a foreign address, see page 8.
|
||||
|
||||
|
||||
Country 䊳
|
||||
Give address outside the United States to which you want any
|
||||
refund check mailed. If same as above, write “Same.”
|
||||
|
||||
|
||||
|
||||
|
||||
Of what country were you a citizen or national during the tax year? 䊳
|
||||
|
||||
Give address in the country where you are a permanent resident.
|
||||
If same as above, write “Same.”
|
||||
|
||||
|
||||
|
||||
|
||||
U.S. Nonresident Alien Income Tax Return
|
||||
|
||||
|
||||
beginning
|
||||
|
||||
For the year January 1–December 31, 2007, or other tax year
|
||||
|
||||
|
||||
, 2007, and ending
|
||||
|
||||
Last name
|
||||
|
||||
|
||||
1040NR
|
||||
|
||||
|
||||
Form
|
||||
Department of the Treasury
|
||||
Internal Revenue Service
|
||||
|
||||
|
||||
Your first name and initial
|
||||
|
||||
P
|
||||
|
||||
Please print or type.
|
||||
|
||||
|
||||
|
||||
Also attach Form(s) 1099-R if tax was withheld.
|
||||
|
||||
Attach Forms W-2 here.
|
||||
|
||||
7a
|
||||
|
||||
7b
|
||||
|
||||
Yourself
|
||||
|
||||
|
||||
Spouse
|
||||
|
||||
|
||||
兵
|
||||
|
||||
䊳
|
||||
|
||||
䊳
|
||||
|
||||
䊳
|
||||
|
||||
䊳
|
||||
|
||||
䊳
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
No. of boxes checked
|
||||
on 7a and 7b
|
||||
|
||||
No. of children on
|
||||
7c who:
|
||||
|
||||
● lived with you
|
||||
● did not live with
|
||||
you due to divorce
|
||||
or separation
|
||||
|
||||
Dependents on 7c
|
||||
not entered above
|
||||
|
||||
dd numbers entered
|
||||
|
||||
on lines above
|
||||
|
||||
|
||||
8
|
||||
|
||||
9a
|
||||
|
||||
A
|
||||
|
||||
10a
|
||||
|
||||
11
|
||||
12
|
||||
13
|
||||
14
|
||||
15
|
||||
16b
|
||||
17b
|
||||
18
|
||||
19
|
||||
20
|
||||
21
|
||||
|
||||
23
|
||||
|
||||
34
|
||||
35
|
||||
|
||||
Form 1040NR (2007)
|
||||
|
||||
|
||||
|
||||
If you check box 7b, enter your spouse’s
|
||||
identifying number 䊳
|
||||
|
||||
relationship
|
||||
|
||||
to you
|
||||
|
||||
|
||||
(4)
|
||||
if qualifying
|
||||
child for child tax
|
||||
credit (see page 9)
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
其
|
||||
|
||||
Filing status. Check only one box (1–6 below).
|
||||
|
||||
1
|
||||
|
||||
2
|
||||
|
||||
3
|
||||
|
||||
4
|
||||
|
||||
5
|
||||
|
||||
6
|
||||
|
||||
Caution: Do not check box 7a if your parent (or someone else) can claim you as a dependent.
|
||||
Do not check box 7b if your spouse had any U.S. gross income.
|
||||
7c
|
||||
|
||||
Single resident of Canada or Mexico, or a single U.S. national
|
||||
Other single nonresident alien
|
||||
Married resident of Canada or Mexico, or a married U.S. national
|
||||
|
||||
Married resident of the Republic of Korea (South Korea)
|
||||
|
||||
Other married nonresident alien
|
||||
|
||||
Qualifying widow(er) with dependent child (see page 9)
|
||||
|
||||
|
||||
Dependents: (see page 9)
|
||||
|
||||
(3) Dependent’s
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Filing Status and Exemptions for Individuals (see page 8)
|
||||
|
||||
|
||||
|
||||
(1) First name
|
||||
|
||||
|
||||
|
||||
Last name
|
||||
|
||||
|
||||
(2) Dependent’s
|
||||
identifying number
|
||||
|
||||
|
||||
.
|
||||
.
|
||||
.
|
||||
|
||||
.
|
||||
.
|
||||
.
|
||||
|
||||
.
|
||||
.
|
||||
.
|
||||
|
||||
.
|
||||
.
|
||||
.
|
||||
|
||||
|
||||
.
|
||||
.
|
||||
.
|
||||
|
||||
.
|
||||
.
|
||||
.
|
||||
|
||||
.
|
||||
.
|
||||
.
|
||||
|
||||
.
|
||||
.
|
||||
.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
9b
|
||||
|
||||
10b
|
||||
|
||||
16b
|
||||
17b
|
||||
|
||||
16a
|
||||
|
||||
17a
|
||||
|
||||
|
||||
Total number of exemptions claimed
|
||||
|
||||
Wages, salaries, tips, etc. Attach Form(s) W-2
|
||||
|
||||
Taxable interest
|
||||
Tax-exempt interest. Do not include on line 9a
|
||||
Ordinary dividends
|
||||
|
||||
Qualified dividends (see page 11)
|
||||
Taxable refunds, credits, or offsets of state and local income taxes (see page 11)
|
||||
|
||||
Scholarship and fellowship grants. Attach Form(s) 1042-S or required statement (see page 11)
|
||||
|
||||
Business income or (loss). Attach Schedule C or C-EZ (Form 1040)
|
||||
|
||||
Capital gain or (loss). Attach Schedule D (Form 1040) if required. If not required, check here
|
||||
|
||||
Other gains or (losses). Attach Form 4797
|
||||
IRA distributions
|
||||
|
||||
Pensions and annuities
|
||||
Rental real estate, royalties, partnerships, trusts, etc. Attach Schedule E (Form 1040)
|
||||
Farm income or (loss). Attach Schedule F (Form 1040)
|
||||
|
||||
Unemployment compensation
|
||||
Other income. List type and amount (see page 15)
|
||||
Total income exempt by a treaty from page 5, Item M
|
||||
Add lines 8, 9a, 10a, 11–15, 16b, and 17b–21. This is your total effectively connected income 䊳
|
||||
|
||||
Educator expenses (see page 15)
|
||||
Health savings account deduction. Attach Form 8889
|
||||
Moving expenses. Attach Form 3903
|
||||
Self-employed SEP, SIMPLE, and qualified plans
|
||||
|
||||
Self-employed health insurance deduction (see page 16)
|
||||
Penalty on early withdrawal of savings
|
||||
|
||||
Scholarship and fellowship grants excluded
|
||||
|
||||
IRA deduction (see page 16)
|
||||
|
||||
Student loan interest deduction (see page 16)
|
||||
|
||||
Domestic production activities deduction. Attach Form 8903
|
||||
|
||||
Add lines 24 through 33
|
||||
Subtract line 34 from line 23. Enter here and on line 36. This is your adjusted gross income 䊳
|
||||
|
||||
|
||||
|
||||
24
|
||||
25
|
||||
26
|
||||
27
|
||||
28
|
||||
29
|
||||
30
|
||||
31
|
||||
32
|
||||
33
|
||||
|
||||
Taxable amount (see page 12)
|
||||
|
||||
Taxable amount (see page 13)
|
||||
|
||||
|
||||
22
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
d
|
||||
|
||||
|
||||
8
|
||||
|
||||
9a
|
||||
b
|
||||
|
||||
10a
|
||||
b
|
||||
|
||||
|
||||
11
|
||||
12
|
||||
13
|
||||
14
|
||||
15
|
||||
16a
|
||||
17a
|
||||
18
|
||||
19
|
||||
20
|
||||
21
|
||||
22
|
||||
23
|
||||
24
|
||||
25
|
||||
26
|
||||
27
|
||||
28
|
||||
29
|
||||
30
|
||||
31
|
||||
32
|
||||
33
|
||||
34
|
||||
35
|
||||
|
||||
Income Effectively Connected With U.S. Trade/Business
|
||||
|
||||
|
||||
|
||||
Adjusted Gross Income
|
||||
|
||||
|
||||
|
||||
Enclose, but do not attach, any payment.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
For Disclosure, Privacy Act, and Paperwork Reduction Act Notice, see page 32.
|
||||
|
||||
|
||||
Cat. No. 11364D
|
||||
|
||||
|
||||
|
File diff suppressed because it is too large
|
@ -1,209 +0,0 @@
|
|||
<html><head>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
||||
</head><body>
|
||||
<span style="position:absolute; border: gray 1px solid; left:0px; top:50px; width:612px; height:1008px;"></span>
|
||||
<div style="position:absolute; top:50px;"><a name="1">Page 1</a></div>
|
||||
<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:36px; top:143px; width:283px; height:37px;"><span style="font-family: PDDIPA+Helvetica; font-size:17px">PAGER/SGML
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:16px">Page 1 of 48 Instructions for Form 1040NR
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:156px; top:134px; width:147px; height:25px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">Userid: ________ DTD INSTR04
|
||||
<br>Fileid:
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:192px; top:150px; width:234px; height:9px;"><span style="font-family: PDDIPA+Helvetica; font-size:9px">D:\USERS\8fllb\documents\epicfiles\2007Instructions1040NR.sgm
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:323px; top:129px; width:154px; height:15px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">Leadpct: 0% Pt. size: 9.5 </span><span style="font-family: PDDJAB+ZapfDingbats; font-size:13px">❏ </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">Draft
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:505px; top:129px; width:57px; height:15px;"><span style="font-family: PDDJAB+ZapfDingbats; font-size:13px">❏ </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">Ok to Print
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:444px; top:149px; width:128px; height:31px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">(Init. & date)
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:16px">7:48 - 6-DEC-2007
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:42px; top:198px; width:489px; height:10px;"><span style="font-family: PDDJAC+Helvetica-Oblique; font-size:10px">The type and rule above prints on all proofs including departmental reproduction proofs. MUST be removed before printing.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:42px; top:285px; width:267px; height:130px;"><span style="font-family: PDDJAD+Helvetica-Bold; font-size:46px">20</span><span style="font-family: PDDJCD+Helvetica-Condensed-Black; font-size:48px">07
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:32px">Instructions for
|
||||
<br>Form 1040NR
|
||||
<br></span><span style="font-family: PDDJCE+FranklinGothic-Demi; font-size:16px">U.S. Nonresident Alien Income Tax Return
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:42px; top:429px; width:157px; height:20px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">Section references are to the Internal
|
||||
<br>Revenue Code unless otherwise noted.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:41px; top:428px; width:337px; height:202px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">use a different address this year. See
|
||||
<br></span><span style="font-family: PDDJAC+Helvetica-Oblique; font-size:10px">Where To File</span><span style="font-family: PDDIPA+Helvetica; font-size:10px"> on page 4.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:20px">General Instructions </span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">deduction. </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">The deduction rate for
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Domestic production activities
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:16px">What’s New for 2007
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Tax benefits extended. </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">The following
|
||||
<br>tax benefits were extended through
|
||||
<br>2007.
|
||||
<br></span><span style="font-family: PDDJDF+Symbol; font-size:14px">• </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">Deduction for educator expenses in
|
||||
<br>figuring adjusted gross income.
|
||||
<br></span><span style="font-family: PDDJDF+Symbol; font-size:14px">• </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">District of Columbia first-time
|
||||
<br>homebuyer credit.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Alternative minimum tax (AMT)
|
||||
<br>exemption amount decreased. </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">The
|
||||
<br>AMT exemption amount is decreased to
|
||||
<br>$33,750 ($45,000 if a qualifying
|
||||
<br>widow(er); $22,500 if married filing
|
||||
<br>separately).
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:337px; top:672px; width:16px; height:10px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px"> For
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:335px; top:494px; width:42px; height:10px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px"> If you are
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:221px; top:471px; width:166px; height:533px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">2007 is increased to 6%.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Unreported social security and
|
||||
<br>Medicare tax on wages.
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:10px">an employee and your employer did not
|
||||
<br>withhold social security and Medicare
|
||||
<br>tax, see Form 8919 to figure and report
|
||||
<br>this tax.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Refundable credit for prior-year
|
||||
<br>minimum tax.
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:10px">If you have an unused
|
||||
<br>minimum tax credit carryforward from
|
||||
<br>2004, see Form 8801 to find if you can
|
||||
<br>take this credit.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Health savings account (HSA)
|
||||
<br>funding distributions. </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">You may be
|
||||
<br>able to elect to exclude from income a
|
||||
<br>distribution made from your IRA to your
|
||||
<br>HSA. See the instructions for lines 16a
|
||||
<br>and 16b beginning on page 12.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">New recordkeeping requirements for
|
||||
<br>contributions of money.
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:10px">charitable contributions of money,
|
||||
<br>regardless of the amount, you must
|
||||
<br>maintain as a record of the contribution
|
||||
<br>a bank record (such as a cancelled
|
||||
<br>check) or a written record from the
|
||||
<br>charity. The written record must include
|
||||
<br>the name of the charity, date, and
|
||||
<br>amount of the contribution. See </span><span style="font-family: PDDJAC+Helvetica-Oblique; font-size:10px">Gifts to
|
||||
<br>U.S. Charities</span><span style="font-family: PDDIPA+Helvetica; font-size:10px"> that begins on page 26.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Exemption for housing a person
|
||||
<br>displaced by Hurricane Katrina
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">expires. </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">The additional exemption
|
||||
<br>amount for housing a person displaced
|
||||
<br>by Hurricane Katrina does not apply for
|
||||
<br>2007 or later years.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Telephone excise tax credit.
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:10px">credit was available only on your 2006
|
||||
<br>return. If you filed but did not request it
|
||||
<br>on your 2006 return, file Form 1040X
|
||||
<br>using a simplified procedure explained
|
||||
<br>in its instructions to amend your 2006
|
||||
<br>return. If you were not required to file a
|
||||
<br>2006 return, see the 2006 Form
|
||||
<br>1040EZ-T.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:16px">What’s New for 2008
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">IRA deduction expanded. </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">You may
|
||||
<br>be able to deduct up to $5,000 ($6,000
|
||||
<br>if age 50 or older at the end of the
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:10px">year). You may be able to take an IRA
|
||||
<br>deduction if you were covered by a
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:355px; top:837px; width:20px; height:10px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px"> This
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:276px; top:1015px; width:59px; height:9px;"><span style="font-family: PDDIPA+Helvetica; font-size:9px">Cat. No. 11368V
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:53px; top:642px; width:5px; height:20px;"><span style="font-family: PDDJAD+Helvetica-Bold; font-size:20px">!
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:73px; top:637px; width:122px; height:30px;"><span style="font-family: PDDJAC+Helvetica-Oblique; font-size:10px">At the time these instructions
|
||||
<br>went to print, Congress was
|
||||
<br>considering legislation that
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:42px; top:659px; width:166px; height:111px;"><span style="font-family: PDDJEF+Helvetica-Black; font-size:5px">CAUTION
|
||||
<br></span><span style="font-family: PDDJAC+Helvetica-Oblique; font-size:10px">would increase the amounts above. To
|
||||
<br>find out if this legislation was enacted,
|
||||
<br>and for more details, see the
|
||||
<br>Instructions for Form 6251.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">IRA deduction expanded.
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:10px">covered by a retirement plan, you may
|
||||
<br>be able to take an IRA deduction if your
|
||||
<br>2007 modified adjusted gross income
|
||||
<br>(AGI) is less than $62,000 ($103,000 if
|
||||
<br>a qualifying widow(er)).
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:163px; top:710px; width:46px; height:10px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">If you were
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:54px; top:774px; width:150px; height:10px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">You may be able to deduct up to an
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:41px; top:784px; width:164px; height:220px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">additional $3,000 if you were a
|
||||
<br>participant in a 401(k) plan and your
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:10px">employer was in bankruptcy in an
|
||||
<br>earlier year.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Standard mileage rates. </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">The 2007
|
||||
<br>rate for business use of your vehicle is
|
||||
<br>48</span><span style="font-family: PDDIPA+Helvetica; font-size:6px">1</span><span style="font-family: PDDIPA+Helvetica; font-size:10px">/</span><span style="font-family: PDDIPA+Helvetica; font-size:6px">2</span><span style="font-family: PDDIPA+Helvetica; font-size:10px"> cents a mile. The 2007 rate for
|
||||
<br>use of your vehicle to move is 20 cents
|
||||
<br>a mile. The special rate for charitable
|
||||
<br>use of your vehicle to provide relief
|
||||
<br>related to Hurricane Katrina has
|
||||
<br>expired.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Elective salary deferrals. </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">The
|
||||
<br>maximum amount you can defer under
|
||||
<br>all plans is generally limited to $15,500
|
||||
<br>($10,500 if you only have SIMPLE
|
||||
<br>plans; $18,500 for section 403(b) plans
|
||||
<br>if you qualify for the 15-year rule). See
|
||||
<br>the instructions for line 8 on page 10.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Mailing your return.
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:10px"> If you are filing
|
||||
<br>the return for an estate or trust, you will
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:445px; top:299px; width:122px; height:23px;"><span style="font-family: PDDIPA+Helvetica; font-size:11px">Department of the Treasury
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Internal Revenue Service
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:401px; top:428px; width:167px; height:30px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">retirement plan and your 2008 modified
|
||||
<br>AGI is less than $63,000 ($105,000) if a
|
||||
<br>qualifying widow(er)).
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:414px; top:465px; width:150px; height:10px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">You may be able to deduct up to an
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:401px; top:475px; width:164px; height:239px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">additional $3,000 if you were a
|
||||
<br>participant in a 401(k) plan and your
|
||||
<br>employer was in bankruptcy in an
|
||||
<br>earlier year.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Personal exemption and itemized
|
||||
<br>deduction phaseouts reduced.
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:10px">Taxpayers with adjusted gross income
|
||||
<br>above a certain amount may lose part
|
||||
<br>of their deduction for personal
|
||||
<br>exemptions and itemized deductions.
|
||||
<br>The amount by which these deductions
|
||||
<br>are reduced in 2008 will be only </span><span style="font-family: PDDIPA+Helvetica; font-size:6px">1</span><span style="font-family: PDDIPA+Helvetica; font-size:10px">/</span><span style="font-family: PDDIPA+Helvetica; font-size:6px">2</span><span style="font-family: PDDIPA+Helvetica; font-size:10px"> of
|
||||
<br>the amount of the reduction that
|
||||
<br>otherwise would have applied in 2007.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Capital gain tax rate reduced.
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:10px"> The
|
||||
<br>5% capital gain tax rate is reduced to
|
||||
<br>zero.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Tax on children’s income.
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:10px">8615 will be required to figure the tax
|
||||
<br>for the following children with
|
||||
<br>investment income of more than
|
||||
<br>$1,800.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:525px; top:663px; width:24px; height:10px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px"> Form
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:414px; top:716px; width:151px; height:10px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">1. Children under age 18 at the end
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:402px; top:726px; width:34px; height:10px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">of 2008.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:414px; top:736px; width:133px; height:10px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">2. The following children if their
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:402px; top:746px; width:151px; height:20px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">earned income is not more than half
|
||||
<br>their support.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:414px; top:769px; width:135px; height:10px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">a. Children age 18 at the end of
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:402px; top:779px; width:23px; height:10px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">2008.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:414px; top:790px; width:146px; height:10px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">b. Children over age 18 and under
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:401px; top:800px; width:164px; height:204px;"><span style="font-family: PDDIPA+Helvetica; font-size:10px">age 24 at the end of 2008 who are
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:10px">full-time students.
|
||||
<br>The election to report a child’s
|
||||
<br>investment income on a parent’s return
|
||||
<br>and the special rule for when a child
|
||||
<br>must file Form 6251 will also apply to
|
||||
<br>the children listed above.
|
||||
<br></span><span style="font-family: PDDJAD+Helvetica-Bold; font-size:11px">Expiring tax benefits. </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">The following
|
||||
<br>benefits are scheduled to expire and
|
||||
<br>will not apply for 2008.
|
||||
<br></span><span style="font-family: PDDJDF+Symbol; font-size:14px">• </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">Deduction for educator expenses in
|
||||
<br>figuring adjusted gross income.
|
||||
<br></span><span style="font-family: PDDJDF+Symbol; font-size:14px">• </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">The exclusion from income of
|
||||
<br>qualified charitable deductions.
|
||||
<br></span><span style="font-family: PDDJDF+Symbol; font-size:14px">• </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">Credit for nonbusiness energy
|
||||
<br>property.
|
||||
<br></span><span style="font-family: PDDJDF+Symbol; font-size:14px">• </span><span style="font-family: PDDIPA+Helvetica; font-size:10px">District of Columbia first-time
|
||||
<br></span><span style="font-family: PDDIPA+Helvetica; font-size:10px">homebuyer credit (for homes
|
||||
<br>purchased after 2007).
|
||||
<br></span></div><span style="position:absolute; border: black 1px solid; left:497px; top:157px; width:67px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:42px; top:184px; width:528px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:428px; top:298px; width:11px; height:25px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:426px; top:298px; width:5px; height:18px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:430px; top:301px; width:7px; height:23px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:430px; top:308px; width:4px; height:13px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:418px; top:304px; width:11px; height:12px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:422px; top:315px; width:8px; height:10px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:414px; top:315px; width:8px; height:10px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:421px; top:309px; width:1px; height:10px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:407px; top:297px; width:17px; height:26px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:420px; top:306px; width:3px; height:1px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:416px; top:318px; width:3px; height:5px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:424px; top:318px; width:3px; height:5px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:42px; top:417px; width:528px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:42px; top:638px; width:27px; height:27px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:45px; top:641px; width:21px; height:18px;"></span>
|
||||
<div style="position:absolute; top:0px;">Page: <a href="#1">1</a></div>
|
||||
</body></html>
|
|
@ -1,220 +0,0 @@
|
|||
PAGER/SGML
|
||||
Page 1 of 48 Instructions for Form 1040NR
|
||||
|
||||
Userid: ________ DTD INSTR04
|
||||
Fileid:
|
||||
|
||||
D:\USERS\8fllb\documents\epicfiles\2007Instructions1040NR.sgm
|
||||
|
||||
Leadpct: 0% Pt. size: 9.5 ❏ Draft
|
||||
|
||||
❏ Ok to Print
|
||||
|
||||
(Init. & date)
|
||||
7:48 - 6-DEC-2007
|
||||
|
||||
The type and rule above prints on all proofs including departmental reproduction proofs. MUST be removed before printing.
|
||||
|
||||
2007
|
||||
Instructions for
|
||||
Form 1040NR
|
||||
U.S. Nonresident Alien Income Tax Return
|
||||
|
||||
Section references are to the Internal
|
||||
Revenue Code unless otherwise noted.
|
||||
|
||||
use a different address this year. See
|
||||
Where To File on page 4.
|
||||
General Instructions deduction. The deduction rate for
|
||||
Domestic production activities
|
||||
What’s New for 2007
|
||||
Tax benefits extended. The following
|
||||
tax benefits were extended through
|
||||
2007.
|
||||
• Deduction for educator expenses in
|
||||
figuring adjusted gross income.
|
||||
• District of Columbia first-time
|
||||
homebuyer credit.
|
||||
Alternative minimum tax (AMT)
|
||||
exemption amount decreased. The
|
||||
AMT exemption amount is decreased to
|
||||
$33,750 ($45,000 if a qualifying
|
||||
widow(er); $22,500 if married filing
|
||||
separately).
|
||||
|
||||
For
|
||||
|
||||
If you are
|
||||
|
||||
2007 is increased to 6%.
|
||||
Unreported social security and
|
||||
Medicare tax on wages.
|
||||
an employee and your employer did not
|
||||
withhold social security and Medicare
|
||||
tax, see Form 8919 to figure and report
|
||||
this tax.
|
||||
Refundable credit for prior-year
|
||||
minimum tax.
|
||||
If you have an unused
|
||||
minimum tax credit carryforward from
|
||||
2004, see Form 8801 to find if you can
|
||||
take this credit.
|
||||
Health savings account (HSA)
|
||||
funding distributions. You may be
|
||||
able to elect to exclude from income a
|
||||
distribution made from your IRA to your
|
||||
HSA. See the instructions for lines 16a
|
||||
and 16b beginning on page 12.
|
||||
New recordkeeping requirements for
|
||||
contributions of money.
|
||||
charitable contributions of money,
|
||||
regardless of the amount, you must
|
||||
maintain as a record of the contribution
|
||||
a bank record (such as a cancelled
|
||||
check) or a written record from the
|
||||
charity. The written record must include
|
||||
the name of the charity, date, and
|
||||
amount of the contribution. See Gifts to
|
||||
U.S. Charities that begins on page 26.
|
||||
Exemption for housing a person
|
||||
displaced by Hurricane Katrina
|
||||
expires. The additional exemption
|
||||
amount for housing a person displaced
|
||||
by Hurricane Katrina does not apply for
|
||||
2007 or later years.
|
||||
Telephone excise tax credit.
|
||||
credit was available only on your 2006
|
||||
return. If you filed but did not request it
|
||||
on your 2006 return, file Form 1040X
|
||||
using a simplified procedure explained
|
||||
in its instructions to amend your 2006
|
||||
return. If you were not required to file a
|
||||
2006 return, see the 2006 Form
|
||||
1040EZ-T.
|
||||
What’s New for 2008
|
||||
IRA deduction expanded. You may
|
||||
be able to deduct up to $5,000 ($6,000
|
||||
if age 50 or older at the end of the
|
||||
year). You may be able to take an IRA
|
||||
deduction if you were covered by a
|
||||
|
||||
This
|
||||
|
||||
Cat. No. 11368V
|
||||
|
||||
!
|
||||
|
||||
At the time these instructions
|
||||
went to print, Congress was
|
||||
considering legislation that
|
||||
|
||||
CAUTION
|
||||
would increase the amounts above. To
|
||||
find out if this legislation was enacted,
|
||||
and for more details, see the
|
||||
Instructions for Form 6251.
|
||||
IRA deduction expanded.
|
||||
covered by a retirement plan, you may
|
||||
be able to take an IRA deduction if your
|
||||
2007 modified adjusted gross income
|
||||
(AGI) is less than $62,000 ($103,000 if
|
||||
a qualifying widow(er)).
|
||||
|
||||
If you were
|
||||
|
||||
You may be able to deduct up to an
|
||||
|
||||
additional $3,000 if you were a
|
||||
participant in a 401(k) plan and your
|
||||
employer was in bankruptcy in an
|
||||
earlier year.
|
||||
Standard mileage rates. The 2007
|
||||
rate for business use of your vehicle is
|
||||
481/2 cents a mile. The 2007 rate for
|
||||
use of your vehicle to move is 20 cents
|
||||
a mile. The special rate for charitable
|
||||
use of your vehicle to provide relief
|
||||
related to Hurricane Katrina has
|
||||
expired.
|
||||
Elective salary deferrals. The
|
||||
maximum amount you can defer under
|
||||
all plans is generally limited to $15,500
|
||||
($10,500 if you only have SIMPLE
|
||||
plans; $18,500 for section 403(b) plans
|
||||
if you qualify for the 15-year rule). See
|
||||
the instructions for line 8 on page 10.
|
||||
Mailing your return.
|
||||
If you are filing
|
||||
the return for an estate or trust, you will
|
||||
|
||||
Department of the Treasury
|
||||
Internal Revenue Service
|
||||
|
||||
retirement plan and your 2008 modified
|
||||
AGI is less than $63,000 ($105,000) if a
|
||||
qualifying widow(er)).
|
||||
|
||||
You may be able to deduct up to an
|
||||
|
||||
additional $3,000 if you were a
|
||||
participant in a 401(k) plan and your
|
||||
employer was in bankruptcy in an
|
||||
earlier year.
|
||||
Personal exemption and itemized
|
||||
deduction phaseouts reduced.
|
||||
Taxpayers with adjusted gross income
|
||||
above a certain amount may lose part
|
||||
of their deduction for personal
|
||||
exemptions and itemized deductions.
|
||||
The amount by which these deductions
|
||||
are reduced in 2008 will be only 1/2 of
|
||||
the amount of the reduction that
|
||||
otherwise would have applied in 2007.
|
||||
Capital gain tax rate reduced.
|
||||
The
|
||||
5% capital gain tax rate is reduced to
|
||||
zero.
|
||||
Tax on children’s income.
|
||||
8615 will be required to figure the tax
|
||||
for the following children with
|
||||
investment income of more than
|
||||
$1,800.
|
||||
|
||||
Form
|
||||
|
||||
1. Children under age 18 at the end
|
||||
|
||||
of 2008.
|
||||
|
||||
2. The following children if their
|
||||
|
||||
earned income is not more than half
|
||||
their support.
|
||||
|
||||
a. Children age 18 at the end of
|
||||
|
||||
2008.
|
||||
|
||||
b. Children over age 18 and under
|
||||
|
||||
age 24 at the end of 2008 who are
|
||||
full-time students.
|
||||
The election to report a child’s
|
||||
investment income on a parent’s return
|
||||
and the special rule for when a child
|
||||
must file Form 6251 will also apply to
|
||||
the children listed above.
|
||||
Expiring tax benefits. The following
|
||||
benefits are scheduled to expire and
|
||||
will not apply for 2008.
|
||||
• Deduction for educator expenses in
|
||||
figuring adjusted gross income.
|
||||
• The exclusion from income of
|
||||
qualified charitable deductions.
|
||||
• Credit for nonbusiness energy
|
||||
property.
|
||||
• District of Columbia first-time
|
||||
homebuyer credit (for homes
|
||||
purchased after 2007).
|
||||
|
||||
|
File diff suppressed because it is too large
|
@ -1,113 +0,0 @@
|
|||
<html><head>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
||||
</head><body>
|
||||
<span style="position:absolute; border: gray 1px solid; left:0px; top:50px; width:595px; height:842px;"></span>
|
||||
<div style="position:absolute; top:50px;"><a name="1">Page 1</a></div>
|
||||
<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:87px; top:91px; width:155px; height:12px;"><span style="font-family: Ryumin-Light; font-size:12px">平成 </span><span style="font-family: GMALPM+DFHSMincho-W3G014; font-size:9px">™— </span><span style="font-family: Ryumin-Light; font-size:12px">年 </span><span style="font-family: GMALPM+DFHSMincho-W3G014; font-size:9px">› </span><span style="font-family: Ryumin-Light; font-size:12px">月 </span><span style="font-family: GMALPM+DFHSMincho-W3G014; font-size:9px">™œ </span><span style="font-family: Ryumin-Light; font-size:12px">日 金曜日
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:267px; top:89px; width:12px; height:14px;"><span style="font-family: Ryumin-Light; font-size:14px">官
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:315px; top:89px; width:12px; height:14px;"><span style="font-family: Ryumin-Light; font-size:14px">報
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:392px; top:91px; width:65px; height:12px;"><span style="font-family: Ryumin-Light; font-size:12px">第 </span><span style="font-family: GMALPM+DFHSMincho-W3G014; font-size:9px">›Ÿ˜ </span><span style="font-family: Ryumin-Light; font-size:12px">号
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:527px; top:93px; width:10px; height:9px;"><span style="font-family: GMALPM+DFHSMincho-W3G014; font-size:9px">›
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:540px; top:110px; width:8px; height:65px;"><span style="font-family: GothicBBB-Medium; font-size:9px">政令第百四十九号
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:530px; top:134px; width:8px; height:145px;"><span style="font-family: Ryumin-Light; font-size:9px">道路交通法施行令の一部を改正する政令
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:464px; top:110px; width:64px; height:361px;"><span style="font-family: Ryumin-Light; font-size:9px">内閣は</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">道路交通法の一部を改正する法律</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">平成十九年法律第九十号</span><span style="font-family: Ryumin-Light; font-size:9px">)</span><span style="font-family: Ryumin-Light; font-size:9px">の一部の施行に伴い</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">並び
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">に道路交通法</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">昭和三十五年法律第百五号</span><span style="font-family: Ryumin-Light; font-size:9px">)</span><span style="font-family: Ryumin-Light; font-size:9px">第四条第一項及び第四項</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第五</span><span style="font-family: Ryumin-Light; font-size:9px">条第一項</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第三十九条第
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">一項</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第五十一条第九項</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">同条第二十二項</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同法第七十二条の二第三項及び</span><span style="font-family: Ryumin-Light; font-size:9px">第七十五条の八第二項に
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">おいて準用する場合を含む</span><span style="font-family: Ryumin-Light; font-size:9px">。)、</span><span style="font-family: Ryumin-Light; font-size:9px">第五十一条の三第一項</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第六十三条の四第一</span><span style="font-family: Ryumin-Light; font-size:9px">項第二号</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第七十一条の
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">三第二項ただし書</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第七十一条の六第一項</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第九十条第一項ただし書</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第百</span><span style="font-family: Ryumin-Light; font-size:9px">条の二第一項本文及び第
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">四号</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第百二条の二並びに第百二十五条第一項及び第三項の規定に基づき</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">この政令を制定する</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:442px; top:118px; width:20px; height:353px;"><span style="font-family: Ryumin-Light; font-size:9px">道路交通法施行令</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">昭和三十五年政令第二百七十号</span><span style="font-family: Ryumin-Light; font-size:9px">)</span><span style="font-family: Ryumin-Light; font-size:9px">の一部を次のように</span><span style="font-family: Ryumin-Light; font-size:9px">改正する</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">第一条の二第四項第三号中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">一・五メ</span><span style="font-family: Ryumin-Light; font-size:9px">ー</span><span style="font-family: Ryumin-Light; font-size:9px">トル</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">一メ</span><span style="font-family: Ryumin-Light; font-size:9px">ー</span><span style="font-family: Ryumin-Light; font-size:9px">トル</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改め</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同条第五項第三号中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">第
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:431px; top:110px; width:9px; height:247px;"><span style="font-family: Ryumin-Light; font-size:9px">六十三条の四第一項</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">第六十三条の四第一項第一号</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改める</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:420px; top:118px; width:9px; height:351px;"><span style="font-family: Ryumin-Light; font-size:9px">第二条第一項の表の青色の灯火の項第三号中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">含む</span><span style="font-family: Ryumin-Light; font-size:9px">。)」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">含む</span><span style="font-family: Ryumin-Light; font-size:9px">。</span><span style="font-family: Ryumin-Light; font-size:9px">青色の</span><span style="font-family: Ryumin-Light; font-size:9px">灯火の矢印の項を除き</span><span style="font-family: Ryumin-Light; font-size:9px">、
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:409px; top:274px; width:9px; height:8px;"><span style="font-family: Ryumin-Light; font-size:9px">「
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:404px; top:286px; width:9px; height:143px;"><span style="font-family: Ryumin-Light; font-size:9px">歩行者は</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">進行することが</span><span style="font-family: Ryumin-Light; font-size:9px">できること</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:383px; top:110px; width:9px; height:169px;"><span style="font-family: Ryumin-Light; font-size:9px">以下この条において同じ</span><span style="font-family: Ryumin-Light; font-size:9px">。)</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改め</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同表中
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:379px; top:286px; width:17px; height:185px;"><span style="font-family: Ryumin-Light; font-size:9px">歩行者は</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">道路の横断を始</span><span style="font-family: Ryumin-Light; font-size:9px">めてはならず</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">また</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">道
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">横断を終わるか</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">又は横断</span><span style="font-family: Ryumin-Light; font-size:9px">をやめて引き返さなけれ
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:456px; top:478px; width:93px; height:361px;"><span style="font-family: Ryumin-Light; font-size:9px">一の五 医療機関が</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">傷病者の緊急搬送を</span><span style="font-family: Ryumin-Light; font-size:9px">しようとする都道府県又は市町村の</span><span style="font-family: Ryumin-Light; font-size:9px">要請を受けて</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">当該
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">傷病者が医療機関に緊急搬送をされるま</span><span style="font-family: Ryumin-Light; font-size:9px">での間における応急の治療を行う医</span><span style="font-family: Ryumin-Light; font-size:9px">師を当該傷病者の所
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">在する場所にまで運搬するために使用す</span><span style="font-family: Ryumin-Light; font-size:9px">る自動車
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">第十六条中第二号を削り</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第三号を第二号</span><span style="font-family: Ryumin-Light; font-size:9px">とする</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">第十六条の二及び第十六条の三中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">第五十</span><span style="font-family: Ryumin-Light; font-size:9px">一条第十一項</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">第五十一条第十</span><span style="font-family: Ryumin-Light; font-size:9px">二項</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改める</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">第十六条の五中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">第五十一条第二十項</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">第五十一条第二十一項</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改める</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">第十七条中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">第五十一条第二十一項</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">第五十一条第二十二項</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改め</span><span style="font-family: Ryumin-Light; font-size:9px">、「「</span><span style="font-family: Ryumin-Light; font-size:9px">前</span><span style="font-family: Ryumin-Light; font-size:9px">号</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">とあるのは</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">前
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">号の公示に係る積載物のうち特に貴重と認め</span><span style="font-family: Ryumin-Light; font-size:9px">られるものについては</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同号</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">と</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同条第三号中</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を削
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">る</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:435px; top:486px; width:20px; height:128px;"><span style="font-family: Ryumin-Light; font-size:9px">第十七条の二を次のように改める</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br>(</span><span style="font-family: Ryumin-Light; font-size:9px">委託することのできない事務</span><span style="font-family: Ryumin-Light; font-size:9px">)
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:424px; top:478px; width:9px; height:327px;"><span style="font-family: GothicBBB-Medium; font-size:9px">第十七条の二 </span><span style="font-family: Ryumin-Light; font-size:9px">法第五十一条の三第一項の政</span><span style="font-family: Ryumin-Light; font-size:9px">令で定めるものは</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">次に掲げるとお</span><span style="font-family: Ryumin-Light; font-size:9px">りとする</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:403px; top:486px; width:19px; height:353px;"><span style="font-family: Ryumin-Light; font-size:9px">一 法第五十一条第五項の規定による車両</span><span style="font-family: Ryumin-Light; font-size:9px">の移動の決定
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">二 法第五十一条第六項</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">同条第二十二項</span><span style="font-family: Ryumin-Light; font-size:9px">において準用する場合を含む</span><span style="font-family: Ryumin-Light; font-size:9px">。)</span><span style="font-family: Ryumin-Light; font-size:9px">の規</span><span style="font-family: Ryumin-Light; font-size:9px">定により保管した車
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:392px; top:494px; width:9px; height:217px;"><span style="font-family: Ryumin-Light; font-size:9px">両</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">積載物を含む</span><span style="font-family: Ryumin-Light; font-size:9px">。</span><span style="font-family: Ryumin-Light; font-size:9px">以下この条において</span><span style="font-family: Ryumin-Light; font-size:9px">同じ</span><span style="font-family: Ryumin-Light; font-size:9px">。)</span><span style="font-family: Ryumin-Light; font-size:9px">の返還の決定
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:382px; top:486px; width:9px; height:353px;"><span style="font-family: Ryumin-Light; font-size:9px">三 法第五十一条第七項</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">同条第二十二項</span><span style="font-family: Ryumin-Light; font-size:9px">において読み替えて準用する場合を</span><span style="font-family: Ryumin-Light; font-size:9px">含む</span><span style="font-family: Ryumin-Light; font-size:9px">。)</span><span style="font-family: Ryumin-Light; font-size:9px">又は第八項の
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:372px; top:494px; width:8px; height:57px;"><span style="font-family: Ryumin-Light; font-size:9px">規定による告知
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:362px; top:286px; width:9px; height:159px;"><span style="font-family: Ryumin-Light; font-size:9px">歩行者は</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">道路を横断して</span><span style="font-family: Ryumin-Light; font-size:9px">はならないこと</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:361px; top:486px; width:9px; height:353px;"><span style="font-family: Ryumin-Light; font-size:9px">四 法第五十一条第九項</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">同条第二十二項</span><span style="font-family: Ryumin-Light; font-size:9px">において読み替えて準用する場合を</span><span style="font-family: Ryumin-Light; font-size:9px">含む</span><span style="font-family: Ryumin-Light; font-size:9px">。)</span><span style="font-family: Ryumin-Light; font-size:9px">の規定による
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:346px; top:298px; width:9px; height:8px;"><span style="font-family: Ryumin-Light; font-size:9px">「
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:321px; top:310px; width:29px; height:161px;"><span style="font-family: Ryumin-Light; font-size:9px">一 歩行者は</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">進行</span><span style="font-family: Ryumin-Light; font-size:9px">することができること</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">二 普通自転車</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">法第六十三条の三に規定す
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">号において同じ</span><span style="font-family: Ryumin-Light; font-size:9px">。)</span><span style="font-family: Ryumin-Light; font-size:9px">は</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">横断歩道において直
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:294px; top:110px; width:17px; height:169px;"><span style="font-family: Ryumin-Light; font-size:9px">路を横断している歩行者は</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">すみやかに</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">その
|
||||
<br>ばならないこと</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:299px; top:290px; width:8px; height:9px;"><span style="font-family: Ryumin-Light; font-size:9px">を
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:272px; top:282px; width:9px; height:8px;"><span style="font-family: Ryumin-Light; font-size:9px">」
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:285px; top:310px; width:28px; height:163px;"><span style="font-family: Ryumin-Light; font-size:9px">一 歩行者は</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">道路の</span><span style="font-family: Ryumin-Light; font-size:9px">横断を始めてはならず</span><span style="font-family: Ryumin-Light; font-size:9px">、
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">横断を終わるか</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">又は横断をやめて引き返
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">二 横断歩道を進行</span><span style="font-family: Ryumin-Light; font-size:9px">しようとする普通自転車
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:256px; top:310px; width:20px; height:161px;"><span style="font-family: Ryumin-Light; font-size:9px">一 歩行者は</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">道路</span><span style="font-family: Ryumin-Light; font-size:9px">を横断してはならないこ
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">二 横断歩道を進行</span><span style="font-family: Ryumin-Light; font-size:9px">しようとする普通自転車
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:214px; top:110px; width:17px; height:193px;"><span style="font-family: Ryumin-Light; font-size:9px">る普通自転車をいう</span><span style="font-family: Ryumin-Light; font-size:9px">。</span><span style="font-family: Ryumin-Light; font-size:9px">以下この条及び第二十六条第三
|
||||
<br>進をし</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">又は左折することができること</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:177px; top:110px; width:29px; height:193px;"><span style="font-family: Ryumin-Light; font-size:9px">また</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">道路を横断している歩行者は</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">速やかに</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">その
|
||||
<br>さなければならないこと</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">は</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">道路の横断を始めてはならないこと</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:191px; top:310px; width:9px; height:161px;"><span style="font-family: Ryumin-Light; font-size:9px">に改め</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同条第四項</span><span style="font-family: Ryumin-Light; font-size:9px">の表の人の形の記号を有
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:148px; top:110px; width:21px; height:151px;"><span style="font-family: Ryumin-Light; font-size:9px">と</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">は</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">道路の横断を始めてはならないこと</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:143px; top:306px; width:9px; height:8px;"><span style="font-family: Ryumin-Light; font-size:9px">」
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:121px; top:110px; width:20px; height:361px;"><span style="font-family: Ryumin-Light; font-size:9px">する青色の灯火の項第二号中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">直進</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">右折しようとして右折する地点まで直</span><span style="font-family: Ryumin-Light; font-size:9px">進し</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">その地点において
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">右折することを含む</span><span style="font-family: Ryumin-Light; font-size:9px">。)</span><span style="font-family: Ryumin-Light; font-size:9px">し</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">直進をし</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改める</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:45px; top:110px; width:74px; height:361px;"><span style="font-family: Ryumin-Light; font-size:9px">第三条の二第一項中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">行なわせる</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">行わせる</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に</span><span style="font-family: Ryumin-Light; font-size:9px">、「</span><span style="font-family: Ryumin-Light; font-size:9px">次の各号に</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">次に</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に</span><span style="font-family: Ryumin-Light; font-size:9px">、「</span><span style="font-family: Ryumin-Light; font-size:9px">こえない</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">超えない</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改め</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第十号を第十二号とし</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第四号から第九号までを二号</span><span style="font-family: Ryumin-Light; font-size:9px">ずつ繰り下げ</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第三号を
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">第四号とし</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同号の次に次の一号を加える</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">五 法第二十五条の二第二項の道路標識等
|
||||
<br>第三条の二第一項第二号の次に次の一号を加える</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">三 法第十三条第二項の道路標識等
|
||||
<br>第十三条第一項中第一号の五を第一号の六とし</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第一号の四の次に次の一</span><span style="font-family: Ryumin-Light; font-size:9px">号を加える</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:351px; top:494px; width:8px; height:17px;"><span style="font-family: Ryumin-Light; font-size:9px">公示
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:340px; top:486px; width:9px; height:353px;"><span style="font-family: Ryumin-Light; font-size:9px">五 法第五十一条第十項</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">同条第二十二項</span><span style="font-family: Ryumin-Light; font-size:9px">において準用する場合を含む</span><span style="font-family: Ryumin-Light; font-size:9px">。)</span><span style="font-family: Ryumin-Light; font-size:9px">の規</span><span style="font-family: Ryumin-Light; font-size:9px">定による公示の日付
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:330px; top:494px; width:8px; height:57px;"><span style="font-family: Ryumin-Light; font-size:9px">及び内容の公表
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:319px; top:486px; width:9px; height:353px;"><span style="font-family: Ryumin-Light; font-size:9px">六 法第五十一条第十二項</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">同条第二十二</span><span style="font-family: Ryumin-Light; font-size:9px">項において読み替えて準用する場合</span><span style="font-family: Ryumin-Light; font-size:9px">を含む</span><span style="font-family: Ryumin-Light; font-size:9px">。)</span><span style="font-family: Ryumin-Light; font-size:9px">の規定によ
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:309px; top:494px; width:8px; height:73px;"><span style="font-family: Ryumin-Light; font-size:9px">る車両の売却の決定
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:298px; top:486px; width:9px; height:353px;"><span style="font-family: Ryumin-Light; font-size:9px">七 法第五十一条第十三項</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">同条第二十二</span><span style="font-family: Ryumin-Light; font-size:9px">項において準用する場合を含む</span><span style="font-family: Ryumin-Light; font-size:9px">。)</span><span style="font-family: Ryumin-Light; font-size:9px">の</span><span style="font-family: Ryumin-Light; font-size:9px">規定による車両の廃
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:288px; top:494px; width:8px; height:33px;"><span style="font-family: Ryumin-Light; font-size:9px">棄の決定
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:277px; top:486px; width:9px; height:353px;"><span style="font-family: Ryumin-Light; font-size:9px">八 法第五十一条第十六項</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">同条第二十二</span><span style="font-family: Ryumin-Light; font-size:9px">項において読み替えて準用する場合</span><span style="font-family: Ryumin-Light; font-size:9px">を含む</span><span style="font-family: Ryumin-Light; font-size:9px">。)</span><span style="font-family: Ryumin-Light; font-size:9px">の規定によ
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:267px; top:494px; width:8px; height:25px;"><span style="font-family: Ryumin-Light; font-size:9px">る命令
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:224px; top:486px; width:41px; height:353px;"><span style="font-family: Ryumin-Light; font-size:9px">九 法第五十一条第十七項</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">同条第二十二</span><span style="font-family: Ryumin-Light; font-size:9px">項において準用する場合を含む</span><span style="font-family: Ryumin-Light; font-size:9px">。)</span><span style="font-family: Ryumin-Light; font-size:9px">の</span><span style="font-family: Ryumin-Light; font-size:9px">規定による督促
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">十 法第五十一条第十八項</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">同条第二十二</span><span style="font-family: Ryumin-Light; font-size:9px">項において準用する場合を含む</span><span style="font-family: Ryumin-Light; font-size:9px">。)</span><span style="font-family: Ryumin-Light; font-size:9px">の</span><span style="font-family: Ryumin-Light; font-size:9px">規定による徴収
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">十一 法第五十一条第二十一項の規定によ</span><span style="font-family: Ryumin-Light; font-size:9px">る登録の嘱託
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">第十七条の三を削り</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第十七条の四を第十</span><span style="font-family: Ryumin-Light; font-size:9px">七条の三とし</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第十七条の五から第</span><span style="font-family: Ryumin-Light; font-size:9px">十七条の八までを一
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:213px; top:478px; width:9px; height:71px;"><span style="font-family: Ryumin-Light; font-size:9px">条ずつ繰り上げる</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:171px; top:486px; width:41px; height:280px;"><span style="font-family: Ryumin-Light; font-size:9px">第二十二条第一号中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">乗車装置</span><span style="font-family: Ryumin-Light; font-size:9px">(</span><span style="font-family: Ryumin-Light; font-size:9px">以下</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">の</span><span style="font-family: Ryumin-Light; font-size:9px">下に</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">この条において</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を加える</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">第二十四条の二中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">第二十六条</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">第二</span><span style="font-family: Ryumin-Light; font-size:9px">十五条の二</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改める</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">第二十六条を第二十五条の二とし</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">第三章</span><span style="font-family: Ryumin-Light; font-size:9px">中同条の次に次の一条を加える</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br>(</span><span style="font-family: Ryumin-Light; font-size:9px">普通自転車により歩道を通行することが</span><span style="font-family: Ryumin-Light; font-size:9px">できる者</span><span style="font-family: Ryumin-Light; font-size:9px">)
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:161px; top:478px; width:9px; height:335px;"><span style="font-family: GothicBBB-Medium; font-size:9px">第二十六条 </span><span style="font-family: Ryumin-Light; font-size:9px">法第六十三条の四第一項第二号</span><span style="font-family: Ryumin-Light; font-size:9px">の政令で定める者は</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">次に掲げると</span><span style="font-family: Ryumin-Light; font-size:9px">おりとする</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:130px; top:486px; width:29px; height:353px;"><span style="font-family: Ryumin-Light; font-size:9px">一 児童及び幼児
|
||||
<br>二 七十歳以上の者
|
||||
<br>三 普通自転車により安全に車道を通行す</span><span style="font-family: Ryumin-Light; font-size:9px">ることに支障を生ずる程度の身体の</span><span style="font-family: Ryumin-Light; font-size:9px">障害として内閣府令
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:119px; top:494px; width:8px; height:89px;"><span style="font-family: Ryumin-Light; font-size:9px">で定めるものを有する者
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:45px; top:478px; width:72px; height:361px;"><span style="font-family: Ryumin-Light; font-size:9px">第二十六条の三の二第一項第四号中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">次項</span><span style="font-family: Ryumin-Light; font-size:9px">第三号</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">次項第四号</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改め</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同項第七号中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">次項
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">第六号</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">次項第七号</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改め</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同条第二</span><span style="font-family: Ryumin-Light; font-size:9px">項第七号中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">の横</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">以外</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改</span><span style="font-family: Ryumin-Light; font-size:9px">め</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同号を同項第八
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">号とし</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同項第六号中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">の横</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">以外</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に</span><span style="font-family: Ryumin-Light; font-size:9px">改め</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同号を同項第七号とし</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同項</span><span style="font-family: Ryumin-Light; font-size:9px">第五号中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">の横</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">以外</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改め</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同号を同項第六号とし</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同</span><span style="font-family: Ryumin-Light; font-size:9px">項第四号中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">の横</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">以外</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改</span><span style="font-family: Ryumin-Light; font-size:9px">め</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同号を同項第五
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">号とし</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同項第三号中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">の横</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">以外</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に</span><span style="font-family: Ryumin-Light; font-size:9px">改め</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同号を同項第四号とし</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同項</span><span style="font-family: Ryumin-Light; font-size:9px">第二号中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">の横</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">以外</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改め</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同号を同項第三号とし</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同</span><span style="font-family: Ryumin-Light; font-size:9px">項第一号中</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">の横</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">を</span><span style="font-family: Ryumin-Light; font-size:9px">「</span><span style="font-family: Ryumin-Light; font-size:9px">以外</span><span style="font-family: Ryumin-Light; font-size:9px">」</span><span style="font-family: Ryumin-Light; font-size:9px">に改</span><span style="font-family: Ryumin-Light; font-size:9px">め</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同号を同項第二
|
||||
<br></span><span style="font-family: Ryumin-Light; font-size:9px">号とし</span><span style="font-family: Ryumin-Light; font-size:9px">、</span><span style="font-family: Ryumin-Light; font-size:9px">同項に第一号として次の一号を加え</span><span style="font-family: Ryumin-Light; font-size:9px">る</span><span style="font-family: Ryumin-Light; font-size:9px">。
|
||||
<br></span></div><span style="position:absolute; border: black 1px solid; left:144px; top:111px; width:273px; height:360px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:46px; top:475px; width:503px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:37px; top:107px; width:520px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:37px; top:843px; width:520px; height:0px;"></span>
|
||||
<div style="position:absolute; top:0px;">Page: <a href="#1">1</a></div>
|
||||
</body></html>
|
|
@ -1,157 +0,0 @@
|
|||
平成 ™— 年 › 月 ™œ 日 金曜日
|
||||
|
||||
官
|
||||
|
||||
報
|
||||
|
||||
第 ›Ÿ˜ 号
|
||||
|
||||
›
|
||||
|
||||
政令第百四十九号
|
||||
|
||||
道路交通法施行令の一部を改正する政令
|
||||
|
||||
内閣は、道路交通法の一部を改正する法律(平成十九年法律第九十号)の一部の施行に伴い、並び
|
||||
に道路交通法(昭和三十五年法律第百五号)第四条第一項及び第四項、第五条第一項、第三十九条第
|
||||
一項、第五十一条第九項(同条第二十二項、同法第七十二条の二第三項及び第七十五条の八第二項に
|
||||
おいて準用する場合を含む。)、第五十一条の三第一項、第六十三条の四第一項第二号、第七十一条の
|
||||
三第二項ただし書、第七十一条の六第一項、第九十条第一項ただし書、第百条の二第一項本文及び第
|
||||
四号、第百二条の二並びに第百二十五条第一項及び第三項の規定に基づき、この政令を制定する。
|
||||
|
||||
道路交通法施行令(昭和三十五年政令第二百七十号)の一部を次のように改正する。
|
||||
第一条の二第四項第三号中「一・五メートル」を「一メートル」に改め、同条第五項第三号中「第
|
||||
|
||||
六十三条の四第一項」を「第六十三条の四第一項第一号」に改める。
|
||||
|
||||
第二条第一項の表の青色の灯火の項第三号中「含む。)」を「含む。青色の灯火の矢印の項を除き、
|
||||
|
||||
「
|
||||
|
||||
歩行者は、進行することができること。
|
||||
|
||||
以下この条において同じ。)を」に改め、同表中
|
||||
|
||||
歩行者は、道路の横断を始めてはならず、また、道
|
||||
横断を終わるか、又は横断をやめて引き返さなけれ
|
||||
|
||||
一の五 医療機関が、傷病者の緊急搬送をしようとする都道府県又は市町村の要請を受けて、当該
|
||||
傷病者が医療機関に緊急搬送をされるまでの間における応急の治療を行う医師を当該傷病者の所
|
||||
在する場所にまで運搬するために使用する自動車
|
||||
第十六条中第二号を削り、第三号を第二号とする。
|
||||
第十六条の二及び第十六条の三中「第五十一条第十一項」を「第五十一条第十二項」に改める。
|
||||
第十六条の五中「第五十一条第二十項」を「第五十一条第二十一項」に改める。
|
||||
第十七条中「第五十一条第二十一項」を「第五十一条第二十二項」に改め、「「前号」とあるのは「前
|
||||
号の公示に係る積載物のうち特に貴重と認められるものについては、同号」と、同条第三号中」を削
|
||||
る。
|
||||
|
||||
第十七条の二を次のように改める。
|
||||
(委託することのできない事務)
|
||||
|
||||
第十七条の二 法第五十一条の三第一項の政令で定めるものは、次に掲げるとおりとする。
|
||||
|
||||
一 法第五十一条第五項の規定による車両の移動の決定
|
||||
二 法第五十一条第六項(同条第二十二項において準用する場合を含む。)の規定により保管した車
|
||||
|
||||
両(積載物を含む。以下この条において同じ。)の返還の決定
|
||||
|
||||
三 法第五十一条第七項(同条第二十二項において読み替えて準用する場合を含む。)又は第八項の
|
||||
|
||||
規定による告知
|
||||
|
||||
歩行者は、道路を横断してはならないこと。
|
||||
|
||||
四 法第五十一条第九項(同条第二十二項において読み替えて準用する場合を含む。)の規定による
|
||||
|
||||
「
|
||||
|
||||
一 歩行者は、進行することができること。
|
||||
二 普通自転車(法第六十三条の三に規定す
|
||||
号において同じ。)は、横断歩道において直
|
||||
|
||||
路を横断している歩行者は、すみやかに、その
|
||||
ばならないこと。
|
||||
|
||||
を
|
||||
|
||||
」
|
||||
|
||||
一 歩行者は、道路の横断を始めてはならず、
|
||||
横断を終わるか、又は横断をやめて引き返
|
||||
二 横断歩道を進行しようとする普通自転車
|
||||
|
||||
一 歩行者は、道路を横断してはならないこ
|
||||
二 横断歩道を進行しようとする普通自転車
|
||||
|
||||
る普通自転車をいう。以下この条及び第二十六条第三
|
||||
進をし、又は左折することができること。
|
||||
|
||||
また、道路を横断している歩行者は、速やかに、その
|
||||
さなければならないこと。
|
||||
は、道路の横断を始めてはならないこと。
|
||||
|
||||
に改め、同条第四項の表の人の形の記号を有
|
||||
|
||||
と。
|
||||
は、道路の横断を始めてはならないこと。
|
||||
|
||||
」
|
||||
|
||||
する青色の灯火の項第二号中「直進(右折しようとして右折する地点まで直進し、その地点において
|
||||
右折することを含む。)し」を「直進をし」に改める。
|
||||
|
||||
第三条の二第一項中「行なわせる」を「行わせる」に、「次の各号に」を「次に」に、「こえない」を
|
||||
「超えない」に改め、第十号を第十二号とし、第四号から第九号までを二号ずつ繰り下げ、第三号を
|
||||
第四号とし、同号の次に次の一号を加える。
|
||||
五 法第二十五条の二第二項の道路標識等
|
||||
第三条の二第一項第二号の次に次の一号を加える。
|
||||
三 法第十三条第二項の道路標識等
|
||||
第十三条第一項中第一号の五を第一号の六とし、第一号の四の次に次の一号を加える。
|
||||
|
||||
公示
|
||||
|
||||
五 法第五十一条第十項(同条第二十二項において準用する場合を含む。)の規定による公示の日付
|
||||
|
||||
及び内容の公表
|
||||
|
||||
六 法第五十一条第十二項(同条第二十二項において読み替えて準用する場合を含む。)の規定によ
|
||||
|
||||
る車両の売却の決定
|
||||
|
||||
七 法第五十一条第十三項(同条第二十二項において準用する場合を含む。)の規定による車両の廃
|
||||
|
||||
棄の決定
|
||||
|
||||
八 法第五十一条第十六項(同条第二十二項において読み替えて準用する場合を含む。)の規定によ
|
||||
|
||||
る命令
|
||||
|
||||
九 法第五十一条第十七項(同条第二十二項において準用する場合を含む。)の規定による督促
|
||||
十 法第五十一条第十八項(同条第二十二項において準用する場合を含む。)の規定による徴収
|
||||
十一 法第五十一条第二十一項の規定による登録の嘱託
|
||||
第十七条の三を削り、第十七条の四を第十七条の三とし、第十七条の五から第十七条の八までを一
|
||||
|
||||
条ずつ繰り上げる。
|
||||
|
||||
第二十二条第一号中「乗車装置(以下」の下に「この条において」を加える。
|
||||
第二十四条の二中「第二十六条」を「第二十五条の二」に改める。
|
||||
第二十六条を第二十五条の二とし、第三章中同条の次に次の一条を加える。
|
||||
(普通自転車により歩道を通行することができる者)
|
||||
|
||||
第二十六条 法第六十三条の四第一項第二号の政令で定める者は、次に掲げるとおりとする。
|
||||
|
||||
一 児童及び幼児
|
||||
二 七十歳以上の者
|
||||
三 普通自転車により安全に車道を通行することに支障を生ずる程度の身体の障害として内閣府令
|
||||
|
||||
で定めるものを有する者
|
||||
|
||||
第二十六条の三の二第一項第四号中「次項第三号」を「次項第四号」に改め、同項第七号中「次項
|
||||
第六号」を「次項第七号」に改め、同条第二項第七号中「の横」を「以外」に改め、同号を同項第八
|
||||
号とし、同項第六号中「の横」を「以外」に改め、同号を同項第七号とし、同項第五号中「の横」を
|
||||
「以外」に改め、同号を同項第六号とし、同項第四号中「の横」を「以外」に改め、同号を同項第五
|
||||
号とし、同項第三号中「の横」を「以外」に改め、同号を同項第四号とし、同項第二号中「の横」を
|
||||
「以外」に改め、同号を同項第三号とし、同項第一号中「の横」を「以外」に改め、同号を同項第二
|
||||
号とし、同項に第一号として次の一号を加える。
|
||||
|
||||
|
File diff suppressed because it is too large
|
@ -1,83 +0,0 @@
|
|||
<html><head>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
||||
</head><body>
|
||||
<span style="position:absolute; border: gray 1px solid; left:0px; top:50px; width:595px; height:842px;"></span>
|
||||
<div style="position:absolute; top:50px;"><a name="1">Page 1</a></div>
|
||||
<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:79px; top:125px; width:452px; height:18px;"><span style="font-family: MZSZGI+NimbusRomNo9L-Medi; font-size:18px">Preemptive Information Extraction using Unrestricted Relation Discovery
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:148px; top:164px; width:91px; height:15px;"><span style="font-family: MZSZGI+NimbusRomNo9L-Medi; font-size:15px">Yusuke Shinyama
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:389px; top:164px; width:74px; height:15px;"><span style="font-family: MZSZGI+NimbusRomNo9L-Medi; font-size:15px">Satoshi Sekine
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:255px; top:187px; width:101px; height:14px;"><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:14px">New York University
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:244px; top:201px; width:122px; height:14px;"><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:14px">715, Broadway, 7th Floor
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:252px; top:215px; width:106px; height:14px;"><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:14px">New York, NY, 10003
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:222px; top:224px; width:168px; height:18px;"><span style="font-family: ZNQAHA+CMSY10; font-size:18px">{</span><span style="font-family: CXOZYQ+NimbusMonL-Regu; font-size:11px">yusuke,sekine</span><span style="font-family: ZNQAHA+CMSY10; font-size:18px">}</span><span style="font-family: CXOZYQ+NimbusMonL-Regu; font-size:11px">@cs.nyu.edu
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:163px; top:281px; width:44px; height:15px;"><span style="font-family: MZSZGI+NimbusRomNo9L-Medi; font-size:15px">Abstract
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:93px; top:310px; width:183px; height:175px;"><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:13px">We are trying to extend the boundary of
|
||||
<br>Information Extraction (IE) systems. Ex-
|
||||
<br>isting IE systems require a lot of time and
|
||||
<br>human effort to tune for a new scenario.
|
||||
<br>Preemptive Information Extraction is an
|
||||
<br></span><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:13px">attempt to automatically create all feasible
|
||||
<br></span><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:13px">IE systems in advance without human in-
|
||||
<br>tervention. We propose a technique called
|
||||
<br>Unrestricted Relation Discovery that dis-
|
||||
<br>covers all possible relations from texts and
|
||||
<br>presents them as tables. We present a pre-
|
||||
<br>liminary system that obtains reasonably
|
||||
<br>good results.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:71px; top:505px; width:80px; height:15px;"><span style="font-family: MZSZGI+NimbusRomNo9L-Medi; font-size:15px">1 Background
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:71px; top:528px; width:226px; height:243px;"><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:13px">Every day, a large number of news articles are cre-
|
||||
<br>ated and reported, many of which are unique. But
|
||||
<br>certain types of events, such as hurricanes or mur-
|
||||
<br>ders, are reported again and again throughout a year.
|
||||
<br>The goal of Information Extraction, or IE, is to re-
|
||||
<br>trieve a certain type of news event from past articles
|
||||
<br>and present the events as a table whose columns are
|
||||
<br></span><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:13px">filled with a name of a person or company, accord-
|
||||
<br></span><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:13px">ing to its role in the event. However, existing IE
|
||||
<br>techniques require a lot of human labor. First, you
|
||||
<br>have to specify the type of information you want and
|
||||
<br>collect articles that include this information. Then,
|
||||
<br>you have to analyze the articles and manually craft
|
||||
<br>a set of patterns to capture these events. Most exist-
|
||||
<br>ing IE research focuses on reducing this burden by
|
||||
<br>helping people create such patterns. But each time
|
||||
<br>you want to extract a different kind of information,
|
||||
<br>you need to repeat the whole process: specify arti-
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:313px; top:284px; width:226px; height:94px;"><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:13px">cles and adjust its patterns, either manually or semi-
|
||||
<br>automatically. There is a bit of a dangerous pitfall
|
||||
<br>here. First, it is hard to estimate how good the sys-
|
||||
<br>tem can be after months of work. Furthermore, you
|
||||
<br>might not know if the task is even doable in the first
|
||||
<br>place. Knowing what kind of information is easily
|
||||
<br>obtained in advance would help reduce this risk.
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:313px; top:379px; width:226px; height:175px;"><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:13px">An IE task can be defined as finding a relation
|
||||
<br></span><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:13px">among several entities involved in a certain type of
|
||||
<br>event. For example, in the MUC-6 management
|
||||
<br>succession scenario, one seeks a relation between
|
||||
<br>COMPANY, PERSON and POST involved with hir-
|
||||
<br>ing/firing events. For each row of an extracted ta-
|
||||
<br>ble, you can always read it as “COMPANY hired
|
||||
<br>(or fired) PERSON for POST.” The relation between
|
||||
<br>these entities is retained throughout the table. There
|
||||
<br>are many existing works on obtaining extraction pat-
|
||||
<br>terns for pre-defined relations (Riloff, 1996; Yangar-
|
||||
<br>ber et al., 2000; Agichtein and Gravano, 2000; Sudo
|
||||
<br>et al., 2003).
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:313px; top:555px; width:226px; height:216px;"><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:13px">Unrestricted Relation Discovery is a technique to
|
||||
<br>automatically discover such relations that repeatedly
|
||||
<br>appear in a corpus and present them as a table, with
|
||||
<br>absolutely no human intervention. Unlike most ex-
|
||||
<br>isting IE research, a user does not specify the type
|
||||
<br></span><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:13px">of articles or information wanted. Instead, a system
|
||||
<br></span><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:13px">tries to find all the kinds of relations that are reported
|
||||
<br>multiple times and can be reported in tabular form.
|
||||
<br>This technique will open up the possibility of try-
|
||||
<br>ing new IE scenarios. Furthermore, the system itself
|
||||
<br>can be used as an IE system, since an obtained re-
|
||||
<br>lation is already presented as a table. If this system
|
||||
<br>works to a certain extent, tuning an IE system be-
|
||||
<br>comes a search problem: all the tables are already
|
||||
<br>built “preemptively.” A user only needs to search
|
||||
<br>for a relevant table.
|
||||
<br></span></div><div style="position:absolute; top:0px;">Page: <a href="#1">1</a></div>
|
||||
</body></html>
|
|
@ -1,91 +0,0 @@
|
|||
Preemptive Information Extraction using Unrestricted Relation Discovery
|
||||
|
||||
Yusuke Shinyama
|
||||
|
||||
Satoshi Sekine
|
||||
|
||||
New York University
|
||||
|
||||
715, Broadway, 7th Floor
|
||||
|
||||
New York, NY, 10003
|
||||
|
||||
{yusuke,sekine}@cs.nyu.edu
|
||||
|
||||
Abstract
|
||||
|
||||
We are trying to extend the boundary of
|
||||
Information Extraction (IE) systems. Ex-
|
||||
isting IE systems require a lot of time and
|
||||
human effort to tune for a new scenario.
|
||||
Preemptive Information Extraction is an
|
||||
attempt to automatically create all feasible
|
||||
IE systems in advance without human in-
|
||||
tervention. We propose a technique called
|
||||
Unrestricted Relation Discovery that dis-
|
||||
covers all possible relations from texts and
|
||||
presents them as tables. We present a pre-
|
||||
liminary system that obtains reasonably
|
||||
good results.
|
||||
|
||||
1 Background
|
||||
|
||||
Every day, a large number of news articles are cre-
|
||||
ated and reported, many of which are unique. But
|
||||
certain types of events, such as hurricanes or mur-
|
||||
ders, are reported again and again throughout a year.
|
||||
The goal of Information Extraction, or IE, is to re-
|
||||
trieve a certain type of news event from past articles
|
||||
and present the events as a table whose columns are
|
||||
filled with a name of a person or company, accord-
|
||||
ing to its role in the event. However, existing IE
|
||||
techniques require a lot of human labor. First, you
|
||||
have to specify the type of information you want and
|
||||
collect articles that include this information. Then,
|
||||
you have to analyze the articles and manually craft
|
||||
a set of patterns to capture these events. Most exist-
|
||||
ing IE research focuses on reducing this burden by
|
||||
helping people create such patterns. But each time
|
||||
you want to extract a different kind of information,
|
||||
you need to repeat the whole process: specify arti-
|
||||
|
||||
cles and adjust its patterns, either manually or semi-
|
||||
automatically. There is a bit of a dangerous pitfall
|
||||
here. First, it is hard to estimate how good the sys-
|
||||
tem can be after months of work. Furthermore, you
|
||||
might not know if the task is even doable in the first
|
||||
place. Knowing what kind of information is easily
|
||||
obtained in advance would help reduce this risk.
|
||||
|
||||
An IE task can be defined as finding a relation
|
||||
among several entities involved in a certain type of
|
||||
event. For example, in the MUC-6 management
|
||||
succession scenario, one seeks a relation between
|
||||
COMPANY, PERSON and POST involved with hir-
|
||||
ing/firing events. For each row of an extracted ta-
|
||||
ble, you can always read it as “COMPANY hired
|
||||
(or fired) PERSON for POST.” The relation between
|
||||
these entities is retained throughout the table. There
|
||||
are many existing works on obtaining extraction pat-
|
||||
terns for pre-defined relations (Riloff, 1996; Yangar-
|
||||
ber et al., 2000; Agichtein and Gravano, 2000; Sudo
|
||||
et al., 2003).
|
||||
|
||||
Unrestricted Relation Discovery is a technique to
|
||||
automatically discover such relations that repeatedly
|
||||
appear in a corpus and present them as a table, with
|
||||
absolutely no human intervention. Unlike most ex-
|
||||
isting IE research, a user does not specify the type
|
||||
of articles or information wanted. Instead, a system
|
||||
tries to find all the kinds of relations that are reported
|
||||
multiple times and can be reported in tabular form.
|
||||
This technique will open up the possibility of try-
|
||||
ing new IE scenarios. Furthermore, the system itself
|
||||
can be used as an IE system, since an obtained re-
|
||||
lation is already presented as a table. If this system
|
||||
works to a certain extent, tuning an IE system be-
|
||||
comes a search problem: all the tables are already
|
||||
built “preemptively.” A user only needs to search
|
||||
for a relevant table.
|
||||
|
||||
|
File diff suppressed because it is too large
|
@ -1,15 +0,0 @@
|
|||
<html><head>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
||||
</head><body>
|
||||
<span style="position:absolute; border: gray 1px solid; left:0px; top:50px; width:800px; height:600px;"></span>
|
||||
<div style="position:absolute; top:50px;"><a name="1">Page 1</a></div>
|
||||
<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:62px; top:126px; width:672px; height:157px;"><span style="font-family: DAFPJF+HiraKakuPro-W6; font-size:85px">コンパラブルな新聞記事からの
|
||||
<br>固有表現の発見
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:263px; top:374px; width:468px; height:212px;"><span style="font-family: DAFPJF+HiraKakuPro-W6; font-size:64px">新山 祐介
|
||||
<br>関根 聡
|
||||
<br></span><span style="font-family: DAFPJF+HiraKakuPro-W6; font-size:50px">Computer Science Department
|
||||
<br>New York University
|
||||
<br></span></div><span style="position:absolute; border: black 1px solid; left:0px; top:50px; width:800px; height:600px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:50px; top:308px; width:510px; height:0px;"></span>
|
||||
<div style="position:absolute; border: figure 1px solid; writing-mode:False; left:25px; top:587px; width:41px; height:40px;"></div><div style="position:absolute; top:0px;">Page: <a href="#1">1</a></div>
|
||||
</body></html>
|
|
@ -1,9 +0,0 @@
|
|||
コンパラブルな新聞記事からの
|
||||
固有表現の発見
|
||||
|
||||
新山 祐介
|
||||
関根 聡
|
||||
Computer Science Department
|
||||
New York University
|
||||
|
||||
|
|
@ -1,120 +0,0 @@
|
|||
<?xml version="1.0" encoding="utf-8" ?>
|
||||
<pages>
|
||||
<page id="1" bbox="0.000,0.000,800.000,600.000" rotate="0">
|
||||
<textbox id="0" bbox="62.000,365.240,734.000,523.160">
|
||||
<textline bbox="62.000,437.240,734.000,523.160">
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="62.000,437.240,110.000,523.160" size="85.920">コ</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="110.000,437.240,158.000,523.160" size="85.920">ン</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="158.000,437.240,206.000,523.160" size="85.920">パ</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="206.000,437.240,254.000,523.160" size="85.920">ラ</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="254.000,437.240,302.000,523.160" size="85.920">ブ</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="302.000,437.240,350.000,523.160" size="85.920">ル</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="350.000,437.240,398.000,523.160" size="85.920">な</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="398.000,437.240,446.000,523.160" size="85.920">新</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="446.000,437.240,494.000,523.160" size="85.920">聞</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="494.000,437.240,542.000,523.160" size="85.920">記</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="542.000,437.240,590.000,523.160" size="85.920">事</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="590.000,437.240,638.000,523.160" size="85.920">か</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="638.000,437.240,686.000,523.160" size="85.920">ら</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="686.000,437.240,734.000,523.160" size="85.920">の</text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
<textline bbox="62.000,365.240,398.000,451.160">
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="62.000,365.240,110.000,451.160" size="85.920">固</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="110.000,365.240,158.000,451.160" size="85.920">有</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="158.000,365.240,206.000,451.160" size="85.920">表</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="206.000,365.240,254.000,451.160" size="85.920">現</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="254.000,365.240,302.000,451.160" size="85.920">の</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="302.000,365.240,350.000,451.160" size="85.920">発</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="350.000,365.240,398.000,451.160" size="85.920">見</text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
</textbox>
|
||||
<textbox id="1" bbox="263.532,62.640,732.000,275.120">
|
||||
<textline bbox="576.012,210.680,732.000,275.120">
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="576.012,210.680,612.012,275.120" size="64.440">新</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="612.012,210.680,648.012,275.120" size="64.440">山</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="648.012,210.680,660.000,275.120" size="64.440"> </text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="660.000,210.680,696.000,275.120" size="64.440">祐</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="696.000,210.680,732.000,275.120" size="64.440">介</text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
<textline bbox="612.012,154.680,732.000,219.120">
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="612.012,154.680,648.012,219.120" size="64.440">関</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="648.012,154.680,684.012,219.120" size="64.440">根</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="684.012,154.680,696.000,219.120" size="64.440"> </text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="696.000,154.680,732.000,219.120" size="64.440">聡</text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
<textline bbox="263.532,106.640,732.000,156.760">
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="263.532,106.640,285.736,156.760" size="50.120">C</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="285.736,106.640,304.496,156.760" size="50.120">o</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="304.496,106.640,332.776,156.760" size="50.120">m</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="332.776,106.640,352.600,156.760" size="50.120">p</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="352.600,106.640,371.444,156.760" size="50.120">u</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="371.444,106.640,383.596,156.760" size="50.120">t</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="383.596,106.640,401.432,156.760" size="50.120">e</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="401.432,106.640,415.208,156.760" size="50.120">r</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="415.208,106.640,424.532,156.760" size="50.120"> </text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="424.532,106.640,444.552,156.760" size="50.120">S</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="444.552,106.640,462.052,156.760" size="50.120">c</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="462.052,106.640,469.640,156.760" size="50.120">i</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="469.640,106.640,487.476,156.760" size="50.120">e</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="487.476,106.640,506.320,156.760" size="50.120">n</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="506.320,106.640,523.820,156.760" size="50.120">c</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="523.820,106.640,541.656,156.760" size="50.120">e</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="541.656,106.640,550.980,156.760" size="50.120"> </text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="550.980,106.640,573.548,156.760" size="50.120">D</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="573.548,106.640,591.384,156.760" size="50.120">e</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="591.384,106.640,611.208,156.760" size="50.120">p</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="611.208,106.640,628.960,156.760" size="50.120">a</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="628.960,106.640,642.736,156.760" size="50.120">r</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="642.736,106.640,654.888,156.760" size="50.120">t</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="654.888,106.640,683.168,156.760" size="50.120">m</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="683.168,106.640,701.004,156.760" size="50.120">e</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="701.004,106.640,719.848,156.760" size="50.120">n</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="719.848,106.640,732.000,156.760" size="50.120">t</text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
<textline bbox="424.140,62.640,732.000,112.760">
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="424.140,62.640,447.128,112.760" size="50.120">N</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="447.128,62.640,464.964,112.760" size="50.120">e</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="464.964,62.640,488.764,112.760" size="50.120">w</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="488.764,62.640,498.088,112.760" size="50.120"> </text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="498.088,62.640,519.312,112.760" size="50.120">Y</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="519.312,62.640,538.072,112.760" size="50.120">o</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="538.072,62.640,551.848,112.760" size="50.120">r</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="551.848,62.640,570.104,112.760" size="50.120">k</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="570.104,62.640,579.428,112.760" size="50.120"> </text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="579.428,62.640,602.640,112.760" size="50.120">U</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="602.640,62.640,621.484,112.760" size="50.120">n</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="621.484,62.640,629.072,112.760" size="50.120">i</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="629.072,62.640,646.460,112.760" size="50.120">v</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="646.460,62.640,664.296,112.760" size="50.120">e</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="664.296,62.640,678.072,112.760" size="50.120">r</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="678.072,62.640,694.676,112.760" size="50.120">s</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="694.676,62.640,702.264,112.760" size="50.120">i</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="702.264,62.640,714.416,112.760" size="50.120">t</text>
|
||||
<text font="DAFPJF+HiraKakuPro-W6" bbox="714.416,62.640,732.000,112.760" size="50.120">y</text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
</textbox>
|
||||
<rect linewidth="0" bbox="0.000,0.000,800.000,600.000" />
|
||||
<line linewidth="8" bbox="50.000,342.000,560.000,342.000" />
|
||||
<figure name="Im1" bbox="25.000,23.000,66.000,63.000">
|
||||
<image width="41" height="40" />
|
||||
</figure>
|
||||
<layout>
|
||||
<textgroup bbox="62.000,62.640,734.000,523.160">
|
||||
<textbox id="0" bbox="62.000,365.240,734.000,523.160" />
|
||||
<textbox id="1" bbox="263.532,62.640,732.000,275.120" />
|
||||
</textgroup>
|
||||
</layout>
|
||||
</page>
|
||||
</pages>
|
|
@ -1,15 +0,0 @@
|
|||
<html><head>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
||||
</head><body>
|
||||
<span style="position:absolute; border: gray 1px solid; left:0px; top:50px; width:612px; height:792px;"></span>
|
||||
<div style="position:absolute; top:50px;"><a name="1">Page 1</a></div>
|
||||
<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:100px; top:119px; width:61px; height:27px;"><span style="font-family: Helvetica; font-size:27px">Hello
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:261px; top:119px; width:62px; height:27px;"><span style="font-family: Helvetica; font-size:27px">World
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:100px; top:219px; width:61px; height:27px;"><span style="font-family: Helvetica; font-size:27px">Hello
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:261px; top:219px; width:62px; height:27px;"><span style="font-family: Helvetica; font-size:27px">World
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:100px; top:319px; width:111px; height:27px;"><span style="font-family: Helvetica; font-size:27px">H e l l o
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:321px; top:319px; width:102px; height:27px;"><span style="font-family: Helvetica; font-size:27px">W o r l d
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:100px; top:419px; width:111px; height:27px;"><span style="font-family: Helvetica; font-size:27px">H e l l o
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:321px; top:419px; width:102px; height:27px;"><span style="font-family: Helvetica; font-size:27px">W o r l d
|
||||
<br></span></div><div style="position:absolute; top:0px;">Page: <a href="#1">1</a></div>
|
||||
</body></html>
|
|
@ -1,17 +0,0 @@
|
|||
Hello
|
||||
|
||||
World
|
||||
|
||||
Hello
|
||||
|
||||
World
|
||||
|
||||
H e l l o
|
||||
|
||||
W o r l d
|
||||
|
||||
H e l l o
|
||||
|
||||
W o r l d
|
||||
|
||||
|
|
@ -1,139 +0,0 @@
|
|||
<?xml version="1.0" encoding="utf-8" ?>
|
||||
<pages>
|
||||
<page id="1" bbox="0.000,0.000,612.000,792.000" rotate="0">
|
||||
<textbox id="0" bbox="100.000,695.032,161.344,722.776">
|
||||
<textline bbox="100.000,695.032,161.344,722.776">
|
||||
<text font="Helvetica" bbox="100.000,695.032,117.328,722.776" size="27.744">H</text>
|
||||
<text font="Helvetica" bbox="117.328,695.032,130.672,722.776" size="27.744">e</text>
|
||||
<text font="Helvetica" bbox="130.672,695.032,136.000,722.776" size="27.744">l</text>
|
||||
<text font="Helvetica" bbox="136.000,695.032,141.328,722.776" size="27.744">l</text>
|
||||
<text font="Helvetica" bbox="141.328,695.032,154.672,722.776" size="27.744">o</text>
|
||||
<text font="Helvetica" bbox="154.672,695.032,161.344,722.776" size="27.744"> </text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
</textbox>
|
||||
<textbox id="1" bbox="261.328,695.032,323.992,722.776">
|
||||
<textline bbox="261.328,695.032,323.992,722.776">
|
||||
<text font="Helvetica" bbox="261.328,695.032,283.984,722.776" size="27.744">W</text>
|
||||
<text font="Helvetica" bbox="283.984,695.032,297.328,722.776" size="27.744">o</text>
|
||||
<text font="Helvetica" bbox="297.328,695.032,305.320,722.776" size="27.744">r</text>
|
||||
<text font="Helvetica" bbox="305.320,695.032,310.648,722.776" size="27.744">l</text>
|
||||
<text font="Helvetica" bbox="310.648,695.032,323.992,722.776" size="27.744">d</text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
</textbox>
|
||||
<textbox id="2" bbox="100.000,595.032,161.344,622.776">
|
||||
<textline bbox="100.000,595.032,161.344,622.776">
|
||||
<text font="Helvetica" bbox="100.000,595.032,117.328,622.776" size="27.744">H</text>
|
||||
<text font="Helvetica" bbox="117.328,595.032,130.672,622.776" size="27.744">e</text>
|
||||
<text font="Helvetica" bbox="130.672,595.032,136.000,622.776" size="27.744">l</text>
|
||||
<text font="Helvetica" bbox="136.000,595.032,141.328,622.776" size="27.744">l</text>
|
||||
<text font="Helvetica" bbox="141.328,595.032,154.672,622.776" size="27.744">o</text>
|
||||
<text font="Helvetica" bbox="154.672,595.032,161.344,622.776" size="27.744"> </text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
</textbox>
|
||||
<textbox id="3" bbox="261.344,595.032,324.008,622.776">
|
||||
<textline bbox="261.344,595.032,324.008,622.776">
|
||||
<text font="Helvetica" bbox="261.344,595.032,284.000,622.776" size="27.744">W</text>
|
||||
<text font="Helvetica" bbox="284.000,595.032,297.344,622.776" size="27.744">o</text>
|
||||
<text font="Helvetica" bbox="297.344,595.032,305.336,622.776" size="27.744">r</text>
|
||||
<text font="Helvetica" bbox="305.336,595.032,310.664,622.776" size="27.744">l</text>
|
||||
<text font="Helvetica" bbox="310.664,595.032,324.008,622.776" size="27.744">d</text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
</textbox>
|
||||
<textbox id="4" bbox="100.000,495.032,211.344,522.776">
|
||||
<textline bbox="100.000,495.032,211.344,522.776">
|
||||
<text font="Helvetica" bbox="100.000,495.032,117.328,522.776" size="27.744">H</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="127.328,495.032,140.672,522.776" size="27.744">e</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="150.672,495.032,156.000,522.776" size="27.744">l</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="166.000,495.032,171.328,522.776" size="27.744">l</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="181.328,495.032,194.672,522.776" size="27.744">o</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="204.672,495.032,211.344,522.776" size="27.744"> </text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
</textbox>
|
||||
<textbox id="5" bbox="321.344,495.032,424.008,522.776">
|
||||
<textline bbox="321.344,495.032,424.008,522.776">
|
||||
<text font="Helvetica" bbox="321.344,495.032,344.000,522.776" size="27.744">W</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="354.000,495.032,367.344,522.776" size="27.744">o</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="377.344,495.032,385.336,522.776" size="27.744">r</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="395.336,495.032,400.664,522.776" size="27.744">l</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="410.664,495.032,424.008,522.776" size="27.744">d</text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
</textbox>
|
||||
<textbox id="6" bbox="100.000,395.032,211.264,422.776">
|
||||
<textline bbox="100.000,395.032,211.264,422.776">
|
||||
<text font="Helvetica" bbox="100.000,395.032,117.328,422.776" size="27.744">H</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="127.312,395.032,140.656,422.776" size="27.744">e</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="150.640,395.032,155.968,422.776" size="27.744">l</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="165.952,395.032,171.280,422.776" size="27.744">l</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="181.264,395.032,194.608,422.776" size="27.744">o</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="204.592,395.032,211.264,422.776" size="27.744"> </text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
</textbox>
|
||||
<textbox id="7" bbox="321.232,395.032,423.832,422.776">
|
||||
<textline bbox="321.232,395.032,423.832,422.776">
|
||||
<text font="Helvetica" bbox="321.232,395.032,343.888,422.776" size="27.744">W</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="353.872,395.032,367.216,422.776" size="27.744">o</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="377.200,395.032,385.192,422.776" size="27.744">r</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="395.176,395.032,400.504,422.776" size="27.744">l</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="410.488,395.032,423.832,422.776" size="27.744">d</text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
</textbox>
|
||||
<layout>
|
||||
<textgroup bbox="100.000,395.032,424.008,722.776">
|
||||
<textgroup bbox="100.000,595.032,324.008,722.776">
|
||||
<textgroup bbox="100.000,695.032,323.992,722.776">
|
||||
<textbox id="0" bbox="100.000,695.032,161.344,722.776" />
|
||||
<textbox id="1" bbox="261.328,695.032,323.992,722.776" />
|
||||
</textgroup>
|
||||
<textgroup bbox="100.000,595.032,324.008,622.776">
|
||||
<textbox id="2" bbox="100.000,595.032,161.344,622.776" />
|
||||
<textbox id="3" bbox="261.344,595.032,324.008,622.776" />
|
||||
</textgroup>
|
||||
</textgroup>
|
||||
<textgroup bbox="100.000,395.032,424.008,522.776">
|
||||
<textgroup bbox="100.000,495.032,424.008,522.776">
|
||||
<textbox id="4" bbox="100.000,495.032,211.344,522.776" />
|
||||
<textbox id="5" bbox="321.344,495.032,424.008,522.776" />
|
||||
</textgroup>
|
||||
<textgroup bbox="100.000,395.032,423.832,422.776">
|
||||
<textbox id="6" bbox="100.000,395.032,211.264,422.776" />
|
||||
<textbox id="7" bbox="321.232,395.032,423.832,422.776" />
|
||||
</textgroup>
|
||||
</textgroup>
|
||||
</textgroup>
|
||||
</layout>
|
||||
</page>
|
||||
</pages>
|
|
@ -1,11 +0,0 @@
|
|||
<html><head>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
||||
</head><body>
|
||||
<span style="position:absolute; border: gray 1px solid; left:0px; top:50px; width:612px; height:792px;"></span>
|
||||
<div style="position:absolute; top:50px;"><a name="1">Page 1</a></div>
|
||||
<span style="position:absolute; border: black 1px solid; left:150px; top:492px; width:0px; height:100px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:150px; top:592px; width:250px; height:0px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:200px; top:467px; width:50px; height:75px;"></span>
|
||||
<span style="position:absolute; border: black 1px solid; left:300px; top:442px; width:100px; height:100px;"></span>
|
||||
<div style="position:absolute; top:0px;">Page: <a href="#1">1</a></div>
|
||||
</body></html>
|
|
@ -1 +0,0 @@
|
|||
|
|
@ -1,9 +0,0 @@
|
|||
<?xml version="1.0" encoding="utf-8" ?>
|
||||
<pages>
|
||||
<page id="1" bbox="0.000,0.000,612.000,792.000" rotate="0">
|
||||
<line linewidth="0" bbox="150.000,250.000,150.000,350.000" />
|
||||
<line linewidth="4" bbox="150.000,250.000,400.000,250.000" />
|
||||
<rect linewidth="1" bbox="200.000,300.000,250.000,375.000" />
|
||||
<curve linewidth="1" bbox="300.000,300.000,400.000,400.000" pts="300.000,300.000,300.000,400.000,400.000,400.000,400.000,300.000"/>
|
||||
</page>
|
||||
</pages>
|
|
@ -1,11 +0,0 @@
|
|||
<html><head>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
||||
</head><body>
|
||||
<span style="position:absolute; border: gray 1px solid; left:0px; top:50px; width:612px; height:792px;"></span>
|
||||
<div style="position:absolute; top:50px;"><a name="1">Page 1</a></div>
|
||||
<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:281px; top:575px; width:62px; height:27px;"><span style="font-family: Helvetica; font-size:27px">World
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:241px; top:599px; width:40px; height:27px;"><span style="font-family: Helvetica; font-size:27px">orld
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:tb-rl; left:194px; top:136px; width:48px; height:490px;"><span style="font-family: unknown; font-size:48px">あいうえおあいうえお </span><span style="font-family: Helvetica; font-size:27px">W
|
||||
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:0px; top:72px; width:218px; height:79px;"><span style="font-family: Helvetica; font-size:55px">HelloHello
|
||||
<br></span></div><div style="position:absolute; top:0px;">Page: <a href="#1">1</a></div>
|
||||
</body></html>
|
|
@ -1,9 +0,0 @@
|
|||
World
|
||||
|
||||
orld
|
||||
|
||||
あいうえおあいうえお W
|
||||
|
||||
HelloHello
|
||||
|
||||
|
|
@ -1,72 +0,0 @@
|
|||
<?xml version="1.0" encoding="utf-8" ?>
|
||||
<pages>
|
||||
<page id="1" bbox="0.000,0.000,612.000,792.000" rotate="0">
|
||||
<textbox id="0" bbox="281.352,239.032,344.016,266.776">
|
||||
<textline bbox="281.352,239.032,344.016,266.776">
|
||||
<text font="Helvetica" bbox="281.352,239.032,304.008,266.776" size="27.744">W</text>
|
||||
<text font="Helvetica" bbox="304.008,239.032,317.352,266.776" size="27.744">o</text>
|
||||
<text font="Helvetica" bbox="317.352,239.032,325.344,266.776" size="27.744">r</text>
|
||||
<text font="Helvetica" bbox="325.344,239.032,330.672,266.776" size="27.744">l</text>
|
||||
<text font="Helvetica" bbox="330.672,239.032,344.016,266.776" size="27.744">d</text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
</textbox>
|
||||
<textbox id="1" bbox="241.344,215.032,281.352,242.776">
|
||||
<textline bbox="241.344,215.032,281.352,242.776">
|
||||
<text font="Helvetica" bbox="241.344,215.032,254.688,242.776" size="27.744">o</text>
|
||||
<text font="Helvetica" bbox="254.688,215.032,262.680,242.776" size="27.744">r</text>
|
||||
<text font="Helvetica" bbox="262.680,215.032,268.008,242.776" size="27.744">l</text>
|
||||
<text font="Helvetica" bbox="268.008,215.032,281.352,242.776" size="27.744">d</text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
</textbox>
|
||||
<textbox id="2" bbox="194.688,215.032,242.688,705.760" wmode="vertical">
|
||||
<textline bbox="194.688,215.032,242.688,705.760">
|
||||
<text font="unknown" bbox="194.688,657.760,242.688,705.760" size="48.000">あ</text>
|
||||
<text font="unknown" bbox="194.688,609.760,242.688,657.760" size="48.000">い</text>
|
||||
<text font="unknown" bbox="194.688,561.760,242.688,609.760" size="48.000">う</text>
|
||||
<text font="unknown" bbox="194.688,513.760,242.688,561.760" size="48.000">え</text>
|
||||
<text font="unknown" bbox="194.688,465.760,242.688,513.760" size="48.000">お</text>
|
||||
<text font="unknown" bbox="194.688,441.760,242.688,489.760" size="48.000">あ</text>
|
||||
<text font="unknown" bbox="194.688,393.760,242.688,441.760" size="48.000">い</text>
|
||||
<text font="unknown" bbox="194.688,345.760,242.688,393.760" size="48.000">う</text>
|
||||
<text font="unknown" bbox="194.688,297.760,242.688,345.760" size="48.000">え</text>
|
||||
<text font="unknown" bbox="194.688,249.760,242.688,297.760" size="48.000">お</text>
|
||||
<text> </text>
|
||||
<text font="Helvetica" bbox="218.688,215.032,241.344,242.776" size="27.744">W</text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
</textbox>
|
||||
<textbox id="3" bbox="0.000,690.064,218.688,769.552">
|
||||
<textline bbox="0.000,690.064,218.688,769.552">
|
||||
<text font="Helvetica" bbox="0.000,690.064,34.656,745.552" size="55.488">H</text>
|
||||
<text font="Helvetica" bbox="34.656,690.064,61.344,745.552" size="55.488">e</text>
|
||||
<text font="Helvetica" bbox="61.344,690.064,72.000,745.552" size="55.488">l</text>
|
||||
<text font="Helvetica" bbox="72.000,690.064,82.656,745.552" size="55.488">l</text>
|
||||
<text font="Helvetica" bbox="82.656,690.064,109.344,745.552" size="55.488">o</text>
|
||||
<text font="Helvetica" bbox="109.344,714.064,144.000,769.552" size="55.488">H</text>
|
||||
<text font="Helvetica" bbox="144.000,714.064,170.688,769.552" size="55.488">e</text>
|
||||
<text font="Helvetica" bbox="170.688,714.064,181.344,769.552" size="55.488">l</text>
|
||||
<text font="Helvetica" bbox="181.344,714.064,192.000,769.552" size="55.488">l</text>
|
||||
<text font="Helvetica" bbox="192.000,714.064,218.688,769.552" size="55.488">o</text>
|
||||
<text>
|
||||
</text>
|
||||
</textline>
|
||||
</textbox>
|
||||
<layout>
|
||||
<textgroup bbox="0.000,215.032,344.016,769.552">
|
||||
<textgroup bbox="241.344,215.032,344.016,266.776">
|
||||
<textbox id="0" bbox="281.352,239.032,344.016,266.776" />
|
||||
<textbox id="1" bbox="241.344,215.032,281.352,242.776" />
|
||||
</textgroup>
|
||||
<textgroup bbox="0.000,215.032,242.688,769.552">
|
||||
<textbox id="2" bbox="194.688,215.032,242.688,705.760" />
|
||||
<textbox id="3" bbox="0.000,690.064,218.688,769.552" />
|
||||
</textgroup>
|
||||
</textgroup>
|
||||
</layout>
|
||||
</page>
|
||||
</pages>
|
5
setup.py
|
@ -13,7 +13,10 @@ setup(
|
|||
'six',
|
||||
'sortedcontainers',
|
||||
],
|
||||
extras_require={"dev": ["nose", "tox"]},
|
||||
extras_require={
|
||||
"dev": ["nose", "tox"],
|
||||
"docs": ["sphinx", "sphinx-argparse"],
|
||||
},
|
||||
description='PDF parser and analyzer',
|
||||
long_description=package.__doc__,
|
||||
license='MIT/X',
|
||||
|
|
|
@ -0,0 +1,7 @@
|
|||
import os


def absolute_sample_path(relative_sample_path):
    sample_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '../samples'))
    sample_file = os.path.join(sample_dir, relative_sample_path)
    return sample_file
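The helper above resolves sample files relative to the test module rather than the current working directory. A minimal sketch of the same idea follows, with the base directory passed in explicitly so it can be run outside the repository layout. `sample_path_from` and the `/repo/tests` directory are hypothetical names used only for illustration, and `normpath` is swapped in for `abspath` so the result does not depend on where the snippet runs.

```python
import os


def sample_path_from(module_dir, relative_sample_path):
    """Resolve a sample file against <module_dir>/../samples (illustrative helper)."""
    # normpath collapses the '..' component without touching the filesystem.
    sample_dir = os.path.normpath(os.path.join(module_dir, '..', 'samples'))
    return os.path.join(sample_dir, relative_sample_path)


# Hypothetical test directory; the real helper derives it from __file__.
path = sample_path_from(os.path.join(os.sep, 'repo', 'tests'), 'simple1.pdf')
print(path)
```

Anchoring on the module's own location is what lets the tests find `samples/` regardless of the directory the test runner is started from.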
|
|
@ -1,5 +1,5 @@
|
|||
"""
|
||||
Tests based on the Adobe Glyph List Specification (https://github.com/adobe-type-tools/agl-specification#2-the-mapping)
|
||||
"""Tests based on the Adobe Glyph List Specification
|
||||
See: https://github.com/adobe-type-tools/agl-specification#2-the-mapping
|
||||
|
||||
While not in the specification, lowercase unicode often occurs in PDFs. Therefore lowercase unittest variants are
|
||||
added.
|
||||
|
|
|
@ -0,0 +1,38 @@
|
|||
import unittest

from helpers import absolute_sample_path
from pdfminer.high_level import extract_text


def run(sample_path):
    absolute_path = absolute_sample_path(sample_path)
    s = extract_text(absolute_path)
    return s


test_strings = {
    "simple1.pdf": "Hello \n\nWorld\n\nWorld\n\nHello \n\nH e l l o \n\nH e l l o \n\nW o r l d\n\nW o r l d\n\n\f",
    "simple2.pdf": "\f",
    "simple3.pdf": "HelloHello\n\nWorld\n\nWorld\n\n\f",
}


class TestExtractText(unittest.TestCase):
    def test_simple1(self):
        test_file = "simple1.pdf"
        s = run(test_file)
        self.assertEqual(s, test_strings[test_file])

    def test_simple2(self):
        test_file = "simple2.pdf"
        s = run(test_file)
        self.assertEqual(s, test_strings[test_file])

    def test_simple3(self):
        test_file = "simple3.pdf"
        s = run(test_file)
        self.assertEqual(s, test_strings[test_file])


if __name__ == "__main__":
    unittest.main()
|
|
@ -0,0 +1,16 @@
|
|||
from nose.tools import raises

from helpers import absolute_sample_path
from pdfminer.pdfdocument import PDFDocument
from pdfminer.pdfparser import PDFParser
from pdfminer.pdftypes import PDFObjectNotFound


class TestPdfDocument(object):

    @raises(PDFObjectNotFound)
    def test_get_zero_objid_raises_pdfobjectnotfound(self):
        with open(absolute_sample_path('simple1.pdf'), 'rb') as in_file:
            parser = PDFParser(in_file)
            doc = PDFDocument(parser)
            doc.getobj(0)
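The test above relies on nose's `@raises` decorator. As a hedged sketch, the same kind of check can be written against the standard library's `unittest` with `assertRaises` used as a context manager. `ObjectNotFound` and `FakeDocument` below are stand-ins invented for this example and are not part of pdfminer; they only mimic a lookup that rejects object id 0.

```python
import unittest


class ObjectNotFound(Exception):
    """Stand-in for pdfminer's PDFObjectNotFound (hypothetical)."""


class FakeDocument:
    """Stand-in document (hypothetical): object id 0 is treated as invalid."""

    def __init__(self, objects):
        self._objects = objects

    def getobj(self, objid):
        if objid == 0 or objid not in self._objects:
            raise ObjectNotFound(objid)
        return self._objects[objid]


class TestFakeDocument(unittest.TestCase):
    def test_get_zero_objid_raises(self):
        doc = FakeDocument({1: 'catalog'})
        # The context manager plays the role of nose's @raises decorator.
        with self.assertRaises(ObjectNotFound):
            doc.getobj(0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFakeDocument)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The context-manager form narrows the expectation to the single call that should fail, whereas a decorator accepts the exception from anywhere in the test body.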
|
|
@ -2,12 +2,14 @@
|
|||
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
import nose, logging, os
|
||||
import nose
|
||||
|
||||
from pdfminer.cmapdb import IdentityCMap, CMap, IdentityCMapByte
|
||||
from pdfminer.pdffont import PDFCIDFont
|
||||
from pdfminer.pdftypes import PDFStream
|
||||
from pdfminer.psparser import PSLiteral
|
||||
|
||||
|
||||
class TestPDFEncoding():
|
||||
|
||||
def test_cmapname_onebyteidentityV(self):
|
||||
|
|
|
@ -1,19 +1,9 @@
|
|||
#!/usr/bin/env python
|
||||
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
from nose.tools import assert_equal, assert_true, assert_false
|
||||
from nose import SkipTest
|
||||
import nose
|
||||
|
||||
import logging
|
||||
from nose.tools import assert_equal
|
||||
|
||||
from pdfminer.ccitt import *
|
||||
|
||||
## Test cases
|
||||
##
|
||||
class TestCCITTG4Parser():
|
||||
|
||||
class TestCCITTG4Parser():
|
||||
def get_parser(self, bits):
|
||||
parser = CCITTG4Parser(len(bits))
|
||||
parser._curline = [int(c) for c in bits]
|
||||
|
@ -163,6 +153,3 @@ class TestCCITTG4Parser():
|
|||
parser._do_vertical(1)
|
||||
assert_equal(parser._get_bits(), '00000001')
|
||||
return
|
||||
|
||||
if __name__ == '__main__':
|
||||
nose.runmodule()
|
|
@ -1,24 +1,29 @@
|
|||
#!/usr/bin/env python
|
||||
# -*- coding: utf-8 -*-
|
||||
"""Test of various compression/encoding modules (previously in doctests)
|
||||
"""
|
||||
import binascii
|
||||
|
||||
from nose.tools import assert_equal
|
||||
from nose import SkipTest
|
||||
import nose
|
||||
|
||||
#test of various compression/encoding modules (previously in doctests):
|
||||
from pdfminer.ascii85 import *
|
||||
from pdfminer.arcfour import *
|
||||
from pdfminer.ascii85 import *
|
||||
from pdfminer.lzw import *
|
||||
from pdfminer.runlength import *
|
||||
from pdfminer.rijndael import *
|
||||
from pdfminer.runlength import *
|
||||
|
||||
|
||||
def hex(b):
|
||||
"""encode('hex')"""
|
||||
return binascii.hexlify(b)
|
||||
|
||||
|
||||
def dehex(b):
|
||||
"""decode('hex')"""
|
||||
return binascii.unhexlify(b)
|
||||
|
||||
import binascii
|
||||
def hex(b): return binascii.hexlify(b) #encode('hex')
|
||||
def dehex(b): return binascii.unhexlify(b) #decode('hex')
|
||||
|
||||
class TestAscii85():
|
||||
def test_ascii85decode(self):
|
||||
#The sample string is taken from: http://en.wikipedia.org/w/index.php?title=Ascii85
|
||||
"""The sample string is taken from: http://en.wikipedia.org/w/index.php?title=Ascii85"""
|
||||
assert_equal(ascii85decode(b'9jqo^BlbD-BleB1DJ+*+F(f,q'), b'Man is distinguished')
|
||||
assert_equal(ascii85decode(b'E,9)oF*2M7/c~>'), b'pleasure.')
|
||||
|
||||
|
@ -27,26 +32,26 @@ class TestAscii85():
|
|||
assert_equal(asciihexdecode(b'61 62 2e6364 657>'), b'ab.cdep')
|
||||
assert_equal(asciihexdecode(b'7>'), b'p')
|
||||
|
||||
|
||||
class TestArcfour():
|
||||
def test(self):
|
||||
|
||||
assert_equal(hex(Arcfour(b'Key').process(b'Plaintext')), b'bbf316e8d940af0ad3')
|
||||
assert_equal(hex(Arcfour(b'Wiki').process(b'pedia')), b'1021bf0420')
|
||||
assert_equal(hex(Arcfour(b'Secret').process(b'Attack at dawn')), b'45a01f645fc35b383552544b9bf5')
|
||||
|
||||
|
||||
class TestLzw():
|
||||
def test_lzwdecode(self):
|
||||
assert_equal(lzwdecode(b'\x80\x0b\x60\x50\x22\x0c\x0c\x85\x01'), b'\x2d\x2d\x2d\x2d\x2d\x41\x2d\x2d\x2d\x42')
|
||||
|
||||
|
||||
class TestRunlength():
|
||||
def test_rldecode(self):
|
||||
assert_equal(rldecode(b'\x05123456\xfa7\x04abcde\x80junk'), b'1234567777777abcde')
|
||||
|
||||
|
||||
class TestRijndaelEncryptor():
|
||||
def test_RijndaelEncryptor(self):
|
||||
key = dehex(b'00010203050607080a0b0c0d0f101112')
|
||||
plaintext = dehex(b'506812a45f08c889b97f5980038b8359')
|
||||
assert_equal(hex(RijndaelEncryptor(key, 128).encrypt(plaintext)), b'd8f532538289ef7d06b506a4fd5be9c9')
|
||||
|
||||
if __name__ == '__main__':
|
||||
nose.runmodule()
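The `test_rldecode` vector above can be followed by hand with a small decoder. The sketch below follows the PDF RunLengthDecode rules (a length byte below 128 copies the next `length + 1` bytes, a length byte above 128 repeats the next byte `257 - length` times, and 128 ends the data). `run_length_decode` is an illustrative name and is not pdfminer's `rldecode` implementation.

```python
def run_length_decode(data):
    """Decode PDF RunLengthDecode data (illustrative, not pdfminer's code)."""
    out = bytearray()
    i = 0
    while i < len(data):
        length = data[i]
        i += 1
        if length == 128:
            # 128 is the end-of-data marker; anything after it is ignored.
            break
        if length < 128:
            # Copy the next length + 1 bytes literally.
            out += data[i:i + length + 1]
            i += length + 1
        else:
            # Repeat the next byte 257 - length times.
            out += data[i:i + 1] * (257 - length)
            i += 1
    return bytes(out)


# Same input as the test vector in the diff above.
decoded = run_length_decode(b'\x05123456\xfa7\x04abcde\x80junk')
print(decoded)
```

For that input, `\x05` copies `123456`, `\xfa` (250) repeats `7` seven times, `\x04` copies `abcde`, and `\x80` stops decoding before `junk`.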
|
|
@ -1,18 +1,14 @@
|
|||
#!/usr/bin/env python
|
||||
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
from nose.tools import assert_equal, assert_true, assert_false
|
||||
from nose import SkipTest
|
||||
import nose
|
||||
|
||||
import logging
|
||||
|
||||
from nose.tools import assert_equal
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
from pdfminer.psparser import *
|
||||
|
||||
## Simplistic Test cases
|
||||
##
|
||||
|
||||
class TestPSBaseParser:
|
||||
"""Simplistic Test cases"""
|
||||
|
||||
TESTDATA = br'''%!PS
|
||||
begin end
|
||||
|
@ -66,6 +62,7 @@ func/a/b{(c)do*}def
|
|||
class MyParser(PSBaseParser):
|
||||
def flush(self):
|
||||
self.add_results(*self.popall())
|
||||
|
||||
parser = MyParser(BytesIO(s))
|
||||
r = []
|
||||
try:
|
||||
|
@ -81,6 +78,7 @@ func/a/b{(c)do*}def
|
|||
class MyParser(PSStackParser):
|
||||
def flush(self):
|
||||
self.add_results(*self.popall())
|
||||
|
||||
parser = MyParser(BytesIO(s))
|
||||
r = []
|
||||
try:
|
||||
|
@ -92,17 +90,12 @@ func/a/b{(c)do*}def
|
|||
|
||||
def test_1(self):
|
||||
tokens = self.get_tokens(self.TESTDATA)
|
||||
logging.info(tokens)
|
||||
logger.info(tokens)
|
||||
assert_equal(tokens, self.TOKENS)
|
||||
return
|
||||
|
||||
def test_2(self):
|
||||
objs = self.get_objects(self.TESTDATA)
|
||||
logging.info(objs)
|
||||
logger.info(objs)
|
||||
assert_equal(objs, self.OBJS)
|
||||
return
|
||||
|
||||
if __name__ == '__main__':
|
||||
#import logging,sys,os,six
|
||||
#logging.basicConfig(level=logging.DEBUG, filename='%s_%d.%d.log'%(os.path.basename(__file__),sys.version_info[0],sys.version_info[1]))
|
||||
nose.runmodule()
|
|
@ -1,53 +1,37 @@
|
|||
#!/usr/bin/env python
|
||||
from tempfile import NamedTemporaryFile
|
||||
|
||||
# -*- coding: utf-8 -*-
|
||||
import six
|
||||
|
||||
import nose, logging, os
|
||||
|
||||
if six.PY3:
|
||||
from helpers import absolute_sample_path
|
||||
from tools import dumppdf
|
||||
elif six.PY2:
|
||||
import os, sys
|
||||
sys.path.append(os.path.abspath(os.path.curdir))
|
||||
import tools.dumppdf as dumppdf
|
||||
|
||||
path=os.path.dirname(os.path.abspath(__file__))+'/'
|
||||
|
||||
def run(datapath,filename,options=None):
|
||||
i=path+datapath+filename+'.pdf'
|
||||
o=path+filename+'.xml'
|
||||
def run(filename, options=None):
|
||||
absolute_path = absolute_sample_path(filename)
|
||||
with NamedTemporaryFile() as output_file:
|
||||
if options:
|
||||
s='dumppdf -o%s %s %s'%(o,options,i)
|
||||
s = 'dumppdf -o %s %s %s' % (output_file.name, options, absolute_path)
|
||||
else:
|
||||
s='dumppdf -o%s %s'%(o,i)
|
||||
dumppdf.main(s.split(' '))
|
||||
s = 'dumppdf -o %s %s' % (output_file.name, absolute_path)
|
||||
dumppdf.main(s.split(' ')[1:])
|
||||
|
||||
|
||||
class TestDumpPDF():
|
||||
|
||||
|
||||
def test_1(self):
|
||||
run('../samples/','jo','-t -a')
|
||||
run('../samples/','simple1','-t -a')
|
||||
run('../samples/','simple2','-t -a')
|
||||
run('../samples/','simple3','-t -a')
|
||||
run('jo.pdf', '-t -a')
|
||||
run('simple1.pdf', '-t -a')
|
||||
run('simple2.pdf', '-t -a')
|
||||
run('simple3.pdf', '-t -a')
|
||||
|
||||
def test_2(self):
|
||||
run('../samples/nonfree/','dmca','-t -a')
|
||||
run('nonfree/dmca.pdf', '-t -a')
|
||||
|
||||
def test_3(self):
|
||||
run('../samples/nonfree/','f1040nr')
|
||||
run('nonfree/f1040nr.pdf')
|
||||
|
||||
def test_4(self):
|
||||
run('../samples/nonfree/','i1040nr')
|
||||
run('nonfree/i1040nr.pdf')
|
||||
|
||||
def test_5(self):
|
||||
run('../samples/nonfree/','kampo','-t -a')
|
||||
run('nonfree/kampo.pdf', '-t -a')
|
||||
|
||||
def test_6(self):
|
||||
run('../samples/nonfree/','naacl06-shinyama','-t -a')
|
||||
|
||||
if __name__ == '__main__':
|
||||
#import logging,sys,os,six
|
||||
#logging.basicConfig(level=logging.DEBUG, filename='%s_%d.%d.log'%(os.path.basename(__file__),sys.version_info[0],sys.version_info[1]))
|
||||
nose.runmodule()
|
||||
run('nonfree/naacl06-shinyama.pdf', '-t -a')
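The rewritten `run` helper writes to a temporary file and hands `main` an argument list without the program name, which is what `ArgumentParser.parse_args` expects. The sketch below shows the same pattern against a stand-in; `fake_main` and `run_sketch` are hypothetical names, and `shlex.split` is swapped in for `s.split(' ')` so that option strings with quoted spaces would still split correctly.

```python
import argparse
import shlex
from tempfile import NamedTemporaryFile


def fake_main(argv=None):
    """Hypothetical stand-in for a tool's main(); returns the parsed arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument('files', nargs='+')
    parser.add_argument('--outfile', '-o', default='-')
    parser.add_argument('--all', '-a', action='store_true')
    parser.add_argument('--text-stream', '-t', action='store_true')
    return parser.parse_args(args=argv)


def run_sketch(sample_path, options=None):
    """Build the argument list for one sample file and pass it to fake_main."""
    with NamedTemporaryFile() as output_file:
        argv = ['-o', output_file.name]
        if options:
            argv += shlex.split(options)
        argv.append(sample_path)
        return fake_main(argv)
```

Building the list directly avoids having to slice off a leading program name, and the temporary file is removed when the `with` block exits, so test runs leave no output behind.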
|
||||
|
|
|
@ -2,72 +2,71 @@ import os
|
|||
from shutil import rmtree
|
||||
from tempfile import NamedTemporaryFile, mkdtemp
|
||||
|
||||
import nose
|
||||
|
||||
import tools.pdf2txt as pdf2txt
|
||||
from helpers import absolute_sample_path
|
||||
|
||||
|
||||
def full_path(relative_path_to_this_file):
|
||||
this_file_dir = os.path.dirname(os.path.abspath(__file__))
|
||||
abspath = os.path.abspath(os.path.join(this_file_dir, relative_path_to_this_file))
|
||||
return abspath
|
||||
|
||||
|
||||
def run(datapath, filename, options=None):
|
||||
i = full_path(datapath + filename + '.pdf')
|
||||
o = full_path(filename + '.txt')
|
||||
def run(sample_path, options=None):
|
||||
absolute_path = absolute_sample_path(sample_path)
|
||||
with NamedTemporaryFile() as output_file:
|
||||
if options:
|
||||
s = 'pdf2txt -o%s %s %s' % (o, options, i)
|
||||
s = 'pdf2txt -o %s %s %s' % (output_file.name, options, absolute_path)
|
||||
else:
|
||||
s = 'pdf2txt -o%s %s' % (o, i)
|
||||
s = 'pdf2txt -o %s %s' % (output_file.name, absolute_path)
|
||||
pdf2txt.main(s.split(' ')[1:])
|
||||
|
||||
|
||||
class TestDumpPDF():
|
||||
|
||||
def test_1(self):
|
||||
run('../samples/', 'jo')
|
||||
run('../samples/', 'simple1')
|
||||
run('../samples/', 'simple2')
|
||||
run('../samples/', 'simple3')
|
||||
run('../samples/','sampleOneByteIdentityEncode')
|
||||
def test_jo(self):
|
||||
run('jo.pdf')
|
||||
|
||||
def test_2(self):
|
||||
run('../samples/nonfree/', 'dmca')
|
||||
def test_simple1(self):
|
||||
run('simple1.pdf')
|
||||
|
||||
def test_3(self):
|
||||
run('../samples/nonfree/', 'f1040nr')
|
||||
def test_simple2(self):
|
||||
run('simple2.pdf')
|
||||
|
||||
def test_4(self):
|
||||
run('../samples/nonfree/', 'i1040nr')
|
||||
def test_simple3(self):
|
||||
run('simple3.pdf')
|
||||
|
||||
def test_5(self):
|
||||
run('../samples/nonfree/', 'kampo')
|
||||
def test_sample_one_byte_identity_encode(self):
|
||||
run('sampleOneByteIdentityEncode.pdf')
|
||||
|
||||
def test_6(self):
|
||||
run('../samples/nonfree/', 'naacl06-shinyama')
|
||||
def test_nonfree_175(self):
|
||||
"""Regression test for https://github.com/pdfminer/pdfminer.six/issues/65"""
|
||||
run('nonfree/175.pdf')
|
||||
|
||||
# this test works on Windows but on Linux & Travis-CI it says
|
||||
# PDFSyntaxError: No /Root object! - Is this really a PDF?
|
||||
# TODO: Find why
|
||||
"""
|
||||
def test_7(self):
|
||||
run('../samples/contrib/','stamp-no')
|
||||
"""
|
||||
def test_nonfree_dmca(self):
|
||||
run('nonfree/dmca.pdf')
|
||||
|
||||
def test_8(self):
|
||||
run('../samples/contrib/', '2b', '-A -t xml')
|
||||
def test_nonfree_f1040nr(self):
|
||||
run('nonfree/f1040nr.pdf')
|
||||
|
||||
def test_9(self):
|
||||
run('../samples/nonfree/', '175') # https://github.com/pdfminer/pdfminer.six/issues/65
|
||||
def test_nonfree_i1040nr(self):
|
||||
run('nonfree/i1040nr.pdf')
|
||||
|
||||
def test_10(self):
|
||||
run('../samples/scancode/', 'patchelf') # https://github.com/euske/pdfminer/issues/96
|
||||
def test_nonfree_kampo(self):
|
||||
run('nonfree/kampo.pdf')
|
||||
|
||||
def test_nonfree_naacl06_shinyama(self):
|
||||
run('nonfree/naacl06-shinyama.pdf')
|
||||
|
||||
def test_nlp2004slides(self):
|
||||
run('nonfree/nlp2004slides.pdf')
|
||||
|
||||
def test_contrib_2b(self):
|
||||
run('contrib/2b.pdf', '-A -t xml')
|
||||
|
||||
def test_scancode_patchelf(self):
|
||||
"""Regression test for # https://github.com/euske/pdfminer/issues/96"""
|
||||
run('scancode/patchelf.pdf')
|
||||
|
||||
|
||||
class TestDumpImages(object):
|
||||
|
||||
def extract_images(self, input_file):
|
||||
@staticmethod
|
||||
def extract_images(input_file):
|
||||
output_dir = mkdtemp()
|
||||
with NamedTemporaryFile() as output_file:
|
||||
commands = ['-o', output_file.name, '--output-dir', output_dir, input_file]
|
||||
|
@ -81,13 +80,25 @@ class TestDumpImages(object):
|
|||
|
||||
Regression test for: https://github.com/pdfminer/pdfminer.six/issues/131
|
||||
"""
|
||||
image_files = self.extract_images(full_path('../samples/nonfree/dmca.pdf'))
|
||||
image_files = self.extract_images(absolute_sample_path('../samples/nonfree/dmca.pdf'))
|
||||
assert image_files[0].endswith('bmp')
|
||||
|
||||
def test_nonfree_175(self):
|
||||
"""Extract images of pdf containing jpg images"""
|
||||
self.extract_images(full_path('../samples/nonfree/175.pdf'))
|
||||
self.extract_images(absolute_sample_path('../samples/nonfree/175.pdf'))
|
||||
|
||||
def test_jbig2_image_export(self):
|
||||
"""Extract images of pdf containing jbig2 images
|
||||
|
||||
if __name__ == '__main__':
|
||||
nose.runmodule()
|
||||
Feature test for: https://github.com/pdfminer/pdfminer.six/pull/46
|
||||
"""
|
||||
image_files = self.extract_images(absolute_sample_path('../samples/contrib/pdf-with-jbig2.pdf'))
|
||||
assert image_files[0].endswith('.jb2')
|
||||
|
||||
def test_contrib_matplotlib(self):
|
||||
"""Test a pdf with Type3 font"""
|
||||
run('contrib/matplotlib.pdf')
|
||||
|
||||
def test_nonfree_cmp_itext_logo(self):
|
||||
"""Test a pdf with Type3 font"""
|
||||
run('nonfree/cmp_itext_logo.pdf')
|
||||
|
|
|
@ -1,7 +1,7 @@
|
|||
from nose.tools import assert_equal
|
||||
|
||||
from pdfminer.layout import LTComponent
|
||||
from pdfminer.utils import make_compat_str, Plane
|
||||
from pdfminer.utils import Plane
|
||||
|
||||
|
||||
class TestPlane(object):
|
||||
|
|
192
tools/dumppdf.py
|
@ -1,32 +1,31 @@
|
|||
#!/usr/bin/env python
|
||||
"""Extract pdf structure in XML format"""
|
||||
import logging
|
||||
import os.path
|
||||
import re
|
||||
import sys
|
||||
from argparse import ArgumentParser
|
||||
|
||||
import six
|
||||
|
||||
#
|
||||
# dumppdf.py - dump pdf contents in XML format.
|
||||
#
|
||||
# usage: dumppdf.py [options] [files ...]
|
||||
# options:
|
||||
# -i objid : object id
|
||||
#
|
||||
import sys, os.path, re, logging
|
||||
from pdfminer.psparser import PSKeyword, PSLiteral, LIT
|
||||
from pdfminer.pdfparser import PDFParser
|
||||
from pdfminer.pdfdocument import PDFDocument, PDFNoOutlines
|
||||
from pdfminer.pdfpage import PDFPage
|
||||
from pdfminer.pdfparser import PDFParser
|
||||
from pdfminer.pdftypes import PDFObjectNotFound, PDFValueError
|
||||
from pdfminer.pdftypes import PDFStream, PDFObjRef, resolve1, stream_value
|
||||
from pdfminer.pdfpage import PDFPage
|
||||
from pdfminer.psparser import PSKeyword, PSLiteral, LIT
|
||||
from pdfminer.utils import isnumber
|
||||
|
||||
logging.basicConfig()
|
||||
|
||||
ESC_PAT = re.compile(r'[\000-\037&<>()"\042\047\134\177-\377]')
|
||||
|
||||
|
||||
def e(s):
|
||||
if six.PY3 and isinstance(s, six.binary_type):
|
||||
s = str(s, 'latin-1')
|
||||
return ESC_PAT.sub(lambda m: '&#%d;' % ord(m.group(0)), s)
|
||||
|
||||
import six # Python 2+3 compatibility
|
||||
|
||||
|
||||
# dumpxml
|
||||
def dumpxml(out, obj, codec=None):
|
||||
if obj is None:
|
||||
out.write('<null />')
|
||||
|
@ -87,7 +86,7 @@ def dumpxml(out, obj, codec=None):
|
|||
|
||||
raise TypeError(obj)
|
||||
|
||||
# dumptrailers
|
||||
|
||||
def dumptrailers(out, doc):
|
||||
for xref in doc.xrefs:
|
||||
out.write('<trailer>\n')
|
||||
|
@ -95,7 +94,7 @@ def dumptrailers(out, doc):
|
|||
out.write('\n</trailer>\n\n')
|
||||
return
|
||||
|
||||
# dumpallobjs
|
||||
|
||||
def dumpallobjs(out, doc, codec=None):
|
||||
visited = set()
|
||||
out.write('<pdf>')
|
||||
|
@ -110,12 +109,12 @@ def dumpallobjs(out, doc, codec=None):
|
|||
dumpxml(out, obj, codec=codec)
|
||||
out.write('\n</object>\n\n')
|
||||
except PDFObjectNotFound as e:
|
||||
print >>sys.stderr, 'not found: %r' % e
|
||||
print('not found: %r' % e)
|
||||
dumptrailers(out, doc)
|
||||
out.write('</pdf>')
|
||||
return
|
||||
|
||||
# dumpoutline
|
||||
|
||||
def dumpoutline(outfp, fname, objids, pagenos, password='',
|
||||
dumpall=False, codec=None, extractdir=None):
|
||||
fp = open(fname, 'rb')
|
||||
|
@ -123,6 +122,7 @@ def dumpoutline(outfp, fname, objids, pagenos, password='',
|
|||
doc = PDFDocument(parser, password)
|
||||
pages = dict((page.pageid, pageno) for (pageno, page)
|
||||
in enumerate(PDFPage.create_pages(doc), 1))
|
||||
|
||||
def resolve_dest(dest):
|
||||
if isinstance(dest, str):
|
||||
dest = resolve1(doc.get_dest(dest))
|
||||
|
@ -133,6 +133,7 @@ def dumpoutline(outfp, fname, objids, pagenos, password='',
|
|||
if isinstance(dest, PDFObjRef):
|
||||
dest = dest.resolve()
|
||||
return dest
|
||||
|
||||
try:
|
||||
outlines = doc.get_outlines()
|
||||
outfp.write('<outlines>\n')
|
||||
|
@ -145,7 +146,8 @@ def dumpoutline(outfp, fname, objids, pagenos, password='',
|
|||
action = a
|
||||
if isinstance(action, dict):
|
||||
subtype = action.get('S')
|
||||
if subtype and repr(subtype) == '/\'GoTo\'' and action.get('D'):
|
||||
if subtype and repr(subtype) == '/\'GoTo\'' and action.get(
|
||||
'D'):
|
||||
dest = resolve_dest(action['D'])
|
||||
pageno = pages[dest[0].objid]
|
||||
s = e(title).encode('utf-8', 'xmlcharrefreplace')
|
||||
|
@ -164,9 +166,11 @@ def dumpoutline(outfp, fname, objids, pagenos, password='',
|
|||
fp.close()
|
||||
return
|
||||
|
||||
# extractembedded
|
||||
|
||||
LITERAL_FILESPEC = LIT('Filespec')
|
||||
LITERAL_EMBEDDEDFILE = LIT('EmbeddedFile')
|
||||
|
||||
|
||||
def extractembedded(outfp, fname, objids, pagenos, password='',
|
||||
dumpall=False, codec=None, extractdir=None):
|
||||
def extract1(obj):
|
||||
|
@ -184,8 +188,8 @@ def extractembedded(outfp, fname, objids, pagenos, password='',
|
|||
path = os.path.join(extractdir, filename)
|
||||
if os.path.exists(path):
|
||||
raise IOError('file exists: %r' % path)
|
||||
print >>sys.stderr, 'extracting: %r' % path
|
||||
out = file(path, 'wb')
|
||||
print('extracting: %r' % path)
|
||||
out = open(path, 'wb')
|
||||
out.write(fileobj.get_data())
|
||||
out.close()
|
||||
return
|
||||
|
@ -201,7 +205,7 @@ def extractembedded(outfp, fname, objids, pagenos, password='',
|
|||
fp.close()
|
||||
return
|
||||
|
||||
# dumppdf
|
||||
|
||||
def dumppdf(outfp, fname, objids, pagenos, password='',
|
||||
dumpall=False, codec=None, extractdir=None):
|
||||
fp = open(fname, 'rb')
|
||||
|
@ -230,46 +234,114 @@ def dumppdf(outfp, fname, objids, pagenos, password='',
|
|||
return
|
||||
|
||||
|
||||
# main
|
||||
def main(argv):
|
||||
import getopt
|
||||
def usage():
|
||||
print ('usage: %s [-d] [-a] [-p pageid] [-P password] [-r|-b|-t] [-T] [-E directory] [-i objid] file ...' % argv[0])
|
||||
return 100
|
||||
try:
|
||||
(opts, args) = getopt.getopt(argv[1:], 'dap:P:rbtTE:i:o:')
|
||||
except getopt.GetoptError:
|
||||
return usage()
|
||||
if not args: return usage()
|
||||
objids = []
|
||||
pagenos = set()
|
||||
codec = None
|
||||
password = ''
|
||||
dumpall = False
|
||||
proc = dumppdf
|
||||
outfp = sys.stdout
|
||||
extractdir = None
|
||||
for (k, v) in opts:
|
||||
if k == '-d': logging.getLogger().setLevel(logging.DEBUG)
|
||||
elif k == '-o': outfp = open(v, 'w')
|
||||
elif k == '-i': objids.extend( int(x) for x in v.split(',') )
|
||||
elif k == '-p': pagenos.update( int(x)-1 for x in v.split(',') )
|
||||
elif k == '-P': password = v
|
||||
elif k == '-a': dumpall = True
|
||||
elif k == '-r': codec = 'raw'
|
||||
elif k == '-b': codec = 'binary'
|
||||
elif k == '-t': codec = 'text'
|
||||
elif k == '-T': proc = dumpoutline
|
||||
elif k == '-E':
|
||||
extractdir = v
|
||||
proc = extractembedded
|
||||
def create_parser():
|
||||
parser = ArgumentParser(description=__doc__, add_help=True)
|
||||
parser.add_argument('files', type=str, default=None, nargs='+',
|
||||
help='One or more paths to PDF files.')
|
||||
|
||||
parser.add_argument(
|
||||
'--debug', '-d', default=False, action='store_true',
|
||||
help='Use debug logging level.')
|
||||
procedure_parser = parser.add_mutually_exclusive_group()
|
||||
procedure_parser.add_argument(
|
||||
'--extract-toc', '-T', default=False, action='store_true',
|
||||
help='Extract structure of outline')
|
||||
procedure_parser.add_argument(
|
||||
'--extract-embedded', '-E', type=str,
|
||||
help='Extract embedded files')
|
||||
|
||||
parse_params = parser.add_argument_group(
|
||||
'Parser', description='Used during PDF parsing')
|
||||
parse_params.add_argument(
|
||||
'--page-numbers', type=int, default=None, nargs='+',
|
||||
help='A space-separated list of page numbers to parse.')
|
||||
parse_params.add_argument(
|
||||
'--pagenos', '-p', type=str,
|
||||
help='A comma-separated list of page numbers to parse. Included for '
|
||||
'legacy applications, use --page-numbers for more idiomatic '
|
||||
'argument entry.')
|
||||
parse_params.add_argument(
|
||||
'--objects', '-i', type=str,
|
||||
help='Comma separated list of object numbers to extract')
|
||||
parse_params.add_argument(
|
||||
'--all', '-a', default=False, action='store_true',
|
||||
help='If the structure of all objects should be extracted')
|
||||
parse_params.add_argument(
|
||||
'--password', '-P', type=str, default='',
|
||||
help='The password to use for decrypting PDF file.')
|
||||
|
||||
output_params = parser.add_argument_group(
|
||||
'Output', description='Used during output generation.')
|
||||
output_params.add_argument(
|
||||
'--outfile', '-o', type=str, default='-',
|
||||
help='Path to file where output is written. Or "-" (default) to '
|
||||
'write to stdout.')
|
||||
codec_parser = output_params.add_mutually_exclusive_group()
|
||||
codec_parser.add_argument(
|
||||
'--raw-stream', '-r', default=False, action='store_true',
|
||||
help='Write stream objects without encoding')
|
||||
codec_parser.add_argument(
|
||||
'--binary-stream', '-b', default=False, action='store_true',
|
||||
help='Write stream objects with binary encoding')
|
||||
codec_parser.add_argument(
|
||||
'--text-stream', '-t', default=False, action='store_true',
|
||||
help='Write stream objects as plain text')
|
||||
|
||||
return parser
|
||||
|
||||
|
||||
def main(argv=None):
|
||||
parser = create_parser()
|
||||
args = parser.parse_args(args=argv)
|
||||
|
||||
if args.debug:
|
||||
logging.getLogger().setLevel(logging.DEBUG)
|
||||
|
||||
if args.outfile == '-':
|
||||
outfp = sys.stdout
|
||||
else:
|
||||
outfp = open(args.outfile, 'w')
|
||||
|
||||
if args.objects:
|
||||
objids = [int(x) for x in args.objects.split(',')]
|
||||
else:
|
||||
objids = []
|
||||
|
||||
if args.page_numbers:
|
||||
pagenos = {x - 1 for x in args.page_numbers}
|
||||
elif args.pagenos:
|
||||
pagenos = {int(x) - 1 for x in args.pagenos.split(',')}
|
||||
else:
|
||||
pagenos = set()
|
||||
|
||||
password = args.password
|
||||
if six.PY2 and sys.stdin.encoding:
|
||||
password = password.decode(sys.stdin.encoding)
|
||||
|
||||
for fname in args:
|
||||
if args.raw_stream:
|
||||
codec = 'raw'
|
||||
elif args.binary_stream:
|
||||
codec = 'binary'
|
||||
elif args.text_stream:
|
||||
codec = 'text'
|
||||
else:
|
||||
codec = None
|
||||
|
||||
if args.extract_toc:
|
||||
extractdir = None
|
||||
proc = dumpoutline
|
||||
elif args.extract_embedded:
|
||||
extractdir = args.extract_embedded
|
||||
proc = extractembedded
|
||||
else:
|
||||
extractdir = None
|
||||
proc = dumppdf
|
||||
|
||||
for fname in args.files:
|
||||
proc(outfp, fname, objids, pagenos, password=password,
|
||||
dumpall=dumpall, codec=codec, extractdir=extractdir)
|
||||
dumpall=args.all, codec=codec, extractdir=extractdir)
|
||||
outfp.close()
|
||||
|
||||
if __name__ == '__main__': sys.exit(main(sys.argv))
|
||||
|
||||
if __name__ == '__main__':
|
||||
sys.exit(main())
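The new `main` relies on two argparse behaviours: options in a mutually exclusive group cannot be combined, and both page-number options are converted to zero-based indices, with `--page-numbers` taking precedence over `--pagenos`. The sketch below reproduces only that subset; `create_parser_sketch` and `zero_based_pages` are hypothetical names, not functions from `tools/dumppdf.py`.

```python
from argparse import ArgumentParser


def create_parser_sketch():
    """Build a parser with an illustrative subset of the options above."""
    parser = ArgumentParser(description='Illustrative subset of dumppdf options')
    parser.add_argument('files', type=str, nargs='+')
    parser.add_argument('--page-numbers', type=int, default=None, nargs='+')
    parser.add_argument('--pagenos', '-p', type=str)
    codec_group = parser.add_mutually_exclusive_group()
    codec_group.add_argument('--raw-stream', '-r', action='store_true')
    codec_group.add_argument('--binary-stream', '-b', action='store_true')
    codec_group.add_argument('--text-stream', '-t', action='store_true')
    return parser


def zero_based_pages(args):
    """Convert one-based page arguments to a set of zero-based indices."""
    if args.page_numbers:
        return {x - 1 for x in args.page_numbers}
    if args.pagenos:
        return {int(x) - 1 for x in args.pagenos.split(',')}
    return set()
```

For example, `create_parser_sketch().parse_args(['a.pdf', '--page-numbers', '1', '3'])` yields the page set `{0, 2}`. Because `--page-numbers` uses `nargs='+'`, it is safest to put file paths before it on the command line so the paths are not consumed as page numbers.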
|
||||
|
|
|
@ -1,215 +0,0 @@
|
|||
#!/usr/bin/env python -O
|
||||
#
|
||||
# pdf2html.cgi - Gateway script for converting PDF into HTML.
|
||||
#
|
||||
# Security consideration for public access:
|
||||
#
|
||||
# Limit the process size and/or maximum cpu time.
|
||||
# The process should be chrooted.
|
||||
# The user should be imposed quota.
|
||||
#
|
||||
# How to Setup:
|
||||
# $ mkdir $CGIDIR
|
||||
# $ mkdir $CGIDIR/var
|
||||
# $ python setup.py install_lib --install-dir=$CGIDIR
|
||||
# $ cp pdfminer/tools/pdf2html.cgi $CGIDIR
|
||||
#
|
||||
|
||||
import sys, os, os.path, re, time
|
||||
import cgi, logging, traceback, random
|
||||
# comment this out at runtime.
|
||||
#import cgitb; cgitb.enable()
|
||||
import pdfminer
|
||||
from pdfminer.pdfdocument import PDFDocument
|
||||
from pdfminer.pdfpage import PDFPage
|
||||
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
|
||||
from pdfminer.converter import HTMLConverter, TextConverter
|
||||
from pdfminer.layout import LAParams
|
||||
|
||||
import six #Python 2+3 compatibility
|
||||
|
||||
# quote HTML metacharacters
|
||||
def q(x):
|
||||
return x.replace('&','&amp;').replace('>','&gt;').replace('<','&lt;').replace('"','&quot;')
|
||||
|
||||
# encode parameters as a URL
|
||||
Q = re.compile(r'[^a-zA-Z0-9_.-=]')
|
||||
def url(base, **kw):
|
||||
r = []
|
||||
for (k,v) in six.iteritems(kw):
|
||||
v = Q.sub(lambda m: '%%%02X' % ord(m.group(0)), encoder(q(v), 'replace')[0])
|
||||
r.append('%s=%s' % (k, v))
|
||||
return base+'&'.join(r)
|
||||
|
||||
|
||||
## convert
|
||||
##
|
||||
class FileSizeExceeded(ValueError): pass
|
||||
def convert(infp, outfp, path, codec='utf-8',
|
||||
maxpages=0, maxfilesize=0, pagenos=None,
|
||||
html=True):
|
||||
# save the input file.
|
||||
src = open(path, 'wb')
|
||||
nbytes = 0
|
||||
while 1:
|
||||
data = infp.read(4096)
|
||||
nbytes += len(data)
|
||||
if maxfilesize and maxfilesize < nbytes:
|
||||
raise FileSizeExceeded(maxfilesize)
|
||||
if not data: break
|
||||
src.write(data)
|
||||
src.close()
|
||||
infp.close()
|
||||
# perform conversion and
|
||||
# send the results over the network.
|
||||
rsrcmgr = PDFResourceManager()
|
||||
laparams = LAParams()
|
||||
if html:
|
||||
device = HTMLConverter(rsrcmgr, outfp, codec=codec, laparams=laparams,
|
||||
layoutmode='exact')
|
||||
else:
|
||||
device = TextConverter(rsrcmgr, outfp, codec=codec, laparams=laparams)
|
||||
fp = open(path, 'rb')
|
||||
interpreter = PDFPageInterpreter(rsrcmgr, device)
|
||||
for page in PDFPage.get_pages(fp, pagenos, maxpages=maxpages):
|
||||
interpreter.process_page(page)
|
||||
fp.close()
|
||||
device.close()
|
||||
return
|
||||
|
||||
|
||||
## WebApp
|
||||
##
|
||||
class WebApp(object):
|
||||
|
||||
TITLE = 'pdf2html demo'
|
||||
MAXFILESIZE = 10000000 # set to zero if unlimited.
|
||||
MAXPAGES = 100 # set to zero if unlimited.
|
||||
|
||||
def __init__(self, infp=sys.stdin, outfp=sys.stdout, environ=os.environ,
|
||||
codec='utf-8', apppath='/'):
|
||||
self.infp = infp
|
||||
self.outfp = outfp
|
||||
self.environ = environ
|
||||
self.codec = codec
|
||||
self.apppath = apppath
|
||||
self.remote_addr = self.environ.get('REMOTE_ADDR')
|
||||
self.path_info = self.environ.get('PATH_INFO')
|
||||
self.method = self.environ.get('REQUEST_METHOD', 'GET').upper()
|
||||
self.server = self.environ.get('SERVER_SOFTWARE', '')
|
||||
self.tmpdir = self.environ.get('TEMP', './var/')
|
||||
self.content_type = 'text/html; charset=%s' % codec
|
||||
self.logger = logging.getLogger()
|
||||
return
|
||||
|
||||
def put(self, *args):
|
||||
for x in args:
|
||||
if isinstance(x, str):
|
||||
self.outfp.write(x)
|
||||
elif isinstance(x, unicode):
|
||||
self.outfp.write(x.encode(self.codec, 'xmlcharrefreplace'))
|
||||
return
|
||||
|
||||
def response_200(self):
|
||||
if self.server.startswith('cgi-httpd'):
|
||||
# required for cgi-httpd
|
||||
self.outfp.write('HTTP/1.0 200 OK\r\n')
|
||||
self.outfp.write('Content-type: %s\r\n' % self.content_type)
|
||||
self.outfp.write('Connection: close\r\n\r\n')
|
||||
return
|
||||
|
||||
def response_404(self):
|
||||
if self.server.startswith('cgi-httpd'):
|
||||
# required for cgi-httpd
|
||||
self.outfp.write('HTTP/1.0 404 Not Found\r\n')
|
||||
self.outfp.write('Content-type: text/html\r\n')
|
||||
self.outfp.write('Connection: close\r\n\r\n')
|
||||
self.outfp.write('<html><body>page does not exist</body></body>\n')
|
||||
return
|
||||
|
||||
def response_301(self, url):
|
||||
if self.server.startswith('cgi-httpd'):
|
||||
# required for cgi-httpd
|
||||
self.outfp.write('HTTP/1.0 301 Moved\r\n')
|
||||
self.outfp.write('Location: %s\r\n\r\n' % url)
|
||||
return
|
||||
|
||||
def coverpage(self):
|
||||
self.put(
|
||||
'<html><head><title>%s</title></head><body>\n' % q(self.TITLE),
|
||||
'<h1>%s</h1><hr>\n' % q(self.TITLE),
|
||||
'<form method="POST" action="%s" enctype="multipart/form-data">\n' % q(self.apppath),
|
||||
'<p>Upload PDF File: <input name="f" type="file" value="">\n',
|
||||
' Page numbers (comma-separated):\n',
|
||||
'<input name="p" type="text" size="10" value="">\n',
|
||||
'<p>(Text extraction is limited to maximum %d pages.\n' % self.MAXPAGES,
|
||||
'Maximum file size for input is %d bytes.)\n' % self.MAXFILESIZE,
|
||||
'<p><input type="submit" name="c" value="Convert to HTML">\n',
|
||||
'<input type="submit" name="c" value="Convert to TEXT">\n',
|
||||
'<input type="reset" value="Reset">\n',
|
||||
'</form><hr>\n',
|
||||
'<p>Powered by <a href="http://www.unixuser.org/~euske/python/pdfminer/">PDFMiner</a>-%s\n' % pdfminer.__version__,
|
||||
'</body></html>\n',
|
||||
)
|
||||
return
|
||||
|
||||
def setup(self):
|
||||
self.run = self.response_404
|
||||
status = 404
|
||||
if not os.path.isdir(self.tmpdir):
|
||||
self.logger.error('no tmpdir')
|
||||
status = 304
|
||||
elif self.path_info == self.apppath:
|
||||
self.run = self.convert
|
||||
status = 200
|
||||
return status
|
||||
|
||||
def convert(self):
|
||||
form = cgi.FieldStorage(fp=self.infp, environ=self.environ)
|
||||
if (self.method != 'POST' or
|
||||
'c' not in form or
|
||||
'f' not in form):
|
||||
self.response_200()
|
||||
self.coverpage()
|
||||
return
|
||||
item = form['f']
|
||||
if not (item.file and item.filename):
|
||||
self.response_200()
|
||||
self.coverpage()
|
||||
return
|
||||
cmd = form.getvalue('c')
|
||||
html = (cmd == 'Convert to HTML')
|
||||
pagenos = []
|
||||
if 'p' in form:
|
||||
for m in re.finditer(r'\d+', form.getvalue('p')):
|
||||
try:
|
||||
pagenos.append(int(m.group(0)))
|
||||
except ValueError:
|
||||
pass
|
||||
h = abs(hash((random.random(), self.remote_addr, item.filename)))
|
||||
tmppath = os.path.join(self.tmpdir, '%08x%08x.pdf' % (time.time(), h))
|
||||
self.logger.info('received: host=%s, name=%r, pagenos=%r, tmppath=%r' %
|
||||
(self.remote_addr, item.filename, pagenos, tmppath))
|
||||
try:
|
||||
if not html:
|
||||
self.content_type = 'text/plain; charset=%s' % self.codec
|
||||
self.response_200()
|
||||
try:
|
||||
convert(item.file, self.outfp, tmppath, pagenos=pagenos, codec=self.codec,
|
||||
maxpages=self.MAXPAGES, maxfilesize=self.MAXFILESIZE, html=html)
|
||||
except Exception as e:
|
||||
self.put('<p>Sorry, an error has occurred: %s' % q(repr(e)))
|
||||
self.logger.error('convert: %r: path=%r: %s' % (e, traceback.format_exc()))
|
||||
finally:
|
||||
try:
|
||||
os.remove(tmppath)
|
||||
except:
|
||||
pass
|
||||
return
|
||||
|
||||
|
||||
# main
|
||||
if __name__ == '__main__':
|
||||
app = WebApp()
|
||||
app.setup()
|
||||
sys.exit(app.run())
|
109
tools/pdf2txt.py
|
@ -1,29 +1,30 @@
|
|||
#!/usr/bin/env python
|
||||
|
||||
"""
|
||||
Converts PDF text content (though not images containing text) to plain text, html, xml or "tags".
|
||||
"""
|
||||
"""A command line tool for extracting text and images from PDF and output it to plain text, html, xml or tags."""
|
||||
import argparse
|
||||
import logging
|
||||
import six
|
||||
import sys
|
||||
import pdfminer.settings
|
||||
pdfminer.settings.STRICT = False
|
||||
import six
|
||||
|
||||
import pdfminer.high_level
|
||||
import pdfminer.layout
|
||||
from pdfminer.image import ImageWriter
|
||||
|
||||
logging.basicConfig()
|
||||
|
||||
|
||||
def extract_text(files=[], outfile='-',
|
||||
_py2_no_more_posargs=None, # Bloody Python2 needs a shim
|
||||
no_laparams=False, all_texts=None, detect_vertical=None, # LAParams
|
||||
word_margin=None, char_margin=None, line_margin=None, boxes_flow=None, # LAParams
|
||||
output_type='text', codec='utf-8', strip_control=False,
|
||||
maxpages=0, page_numbers=None, password="", scale=1.0, rotation=0,
|
||||
layoutmode='normal', output_dir=None, debug=False,
|
||||
disable_caching=False, **other):
|
||||
if _py2_no_more_posargs is not None:
|
||||
raise ValueError("Too many positional arguments passed.")
|
||||
disable_caching=False, **kwargs):
|
||||
if '_py2_no_more_posargs' in kwargs:
|
||||
raise DeprecationWarning(
|
||||
'The `_py2_no_more_posargs` argument will be removed in January 2020. At '
|
||||
'that moment pdfminer.six will stop supporting Python 2. Please '
|
||||
'upgrade to Python 3. For more information see '
|
||||
'https://github.com/pdfminer/pdfminer.six/issues/194')
|
||||
|
||||
if not files:
|
||||
raise ValueError("Must provide files to work upon!")
|
||||
|
||||
|
@ -66,28 +67,68 @@ def extract_text(files=[], outfile='-',
|
|||
|
||||
def maketheparser():
|
||||
parser = argparse.ArgumentParser(description=__doc__, add_help=True)
|
||||
parser.add_argument("files", type=str, default=None, nargs="+", help="File to process.")
|
||||
parser.add_argument("-d", "--debug", default=False, action="store_true", help="Debug output.")
|
||||
parser.add_argument("-p", "--pagenos", type=str, help="Comma-separated list of page numbers to parse. Included for legacy applications, use --page-numbers for more idiomatic argument entry.")
|
||||
parser.add_argument("--page-numbers", type=int, default=None, nargs="+", help="Alternative to --pagenos with space-separated numbers; supercedes --pagenos where it is used.")
|
||||
parser.add_argument("-m", "--maxpages", type=int, default=0, help="Maximum pages to parse")
|
||||
parser.add_argument("-P", "--password", type=str, default="", help="Decryption password for PDF")
|
||||
parser.add_argument("-o", "--outfile", type=str, default="-", help="Output file (default \"-\" is stdout)")
|
||||
parser.add_argument("-t", "--output_type", type=str, default="text", help="Output type: text|html|xml|tag (default is text)")
|
||||
parser.add_argument("-c", "--codec", type=str, default="utf-8", help="Text encoding")
|
||||
parser.add_argument("-s", "--scale", type=float, default=1.0, help="Scale")
|
||||
parser.add_argument("-A", "--all-texts", default=None, action="store_true", help="LAParams all texts")
|
||||
parser.add_argument("-V", "--detect-vertical", default=None, action="store_true", help="LAParams detect vertical")
|
||||
parser.add_argument("-W", "--word-margin", type=float, default=None, help="LAParams word margin")
|
||||
parser.add_argument("-M", "--char-margin", type=float, default=None, help="LAParams char margin")
|
||||
parser.add_argument("-L", "--line-margin", type=float, default=None, help="LAParams line margin")
|
||||
parser.add_argument("-F", "--boxes-flow", type=float, default=None, help="LAParams boxes flow")
|
||||
parser.add_argument("-Y", "--layoutmode", default="normal", type=str, help="HTML Layout Mode")
|
||||
parser.add_argument("-n", "--no-laparams", default=False, action="store_true", help="Pass None as LAParams")
|
||||
parser.add_argument("-R", "--rotation", default=0, type=int, help="Rotation")
|
||||
parser.add_argument("-O", "--output-dir", default=None, help="Output directory for images")
|
||||
parser.add_argument("-C", "--disable-caching", default=False, action="store_true", help="Disable caching")
|
||||
parser.add_argument("-S", "--strip-control", default=False, action="store_true", help="Strip control in XML mode")
|
||||
parser.add_argument("files", type=str, default=None, nargs="+", help="One or more paths to PDF files.")
|
||||
|
||||
parser.add_argument("--debug", "-d", default=False, action="store_true",
|
||||
help="Use debug logging level.")
|
||||
parser.add_argument("--disable-caching", "-C", default=False, action="store_true",
|
||||
help="If caching of resources, such as fonts, should be disabled.")
|
||||
|
||||
parse_params = parser.add_argument_group('Parser', description='Used during PDF parsing')
|
||||
parse_params.add_argument("--page-numbers", type=int, default=None, nargs="+",
|
||||
help="A space-separated list of page numbers to parse.")
|
||||
parse_params.add_argument("--pagenos", "-p", type=str,
|
||||
help="A comma-separated list of page numbers to parse. Included for legacy applications, "
|
||||
"use --page-numbers for more idiomatic argument entry.")
|
||||
parse_params.add_argument("--maxpages", "-m", type=int, default=0,
|
||||
help="The maximum number of pages to parse.")
|
||||
parse_params.add_argument("--password", "-P", type=str, default="",
|
||||
help="The password to use for decrypting the PDF file.")
|
||||
parse_params.add_argument("--rotation", "-R", default=0, type=int,
|
||||
help="The number of degrees to rotate the PDF before other types of processing.")
|
||||
|
||||
la_params = parser.add_argument_group('Layout analysis', description='Used during layout analysis.')
|
||||
la_params.add_argument("--no-laparams", "-n", default=False, action="store_true",
|
||||
help="If layout analysis parameters should be ignored.")
|
||||
la_params.add_argument("--detect-vertical", "-V", default=False, action="store_true",
|
||||
help="If vertical text should be considered during layout analysis.")
|
||||
la_params.add_argument("--char-margin", "-M", type=float, default=2.0,
|
||||
help="If two characters are closer together than this margin they are considered to be part "
|
||||
"of the same word. The margin is specified relative to the width of the character.")
|
||||
la_params.add_argument("--word-margin", "-W", type=float, default=0.1,
|
||||
help="If two words are closer together than this margin they are considered to be part "
|
||||
"of the same line. A space is added in between for readability. The margin is "
|
||||
"specified relative to the width of the word.")
|
||||
la_params.add_argument("--line-margin", "-L", type=float, default=0.5,
|
||||
help="If two lines are close together they are considered to be part of the same "
|
||||
"paragraph. The margin is specified relative to the height of a line.")
|
||||
la_params.add_argument("--boxes-flow", "-F", type=float, default=0.5,
|
||||
help="Specifies how much the horizontal and vertical position of a text matter when "
|
||||
"determining the order of lines. The value should be within the range of -1.0 (only "
|
||||
"horizontal position matters) to +1.0 (only vertical position matters).")
|
||||
la_params.add_argument("--all-texts", "-A", default=True, action="store_true",
|
||||
help="If layout analysis should be performed on text in figures.")
|
||||
|
||||
output_params = parser.add_argument_group('Output', description='Used during output generation.')
|
||||
output_params.add_argument("--outfile", "-o", type=str, default="-",
|
||||
help="Path to file where output is written. Or \"-\" (default) to write to stdout.")
|
||||
output_params.add_argument("--output_type", "-t", type=str, default="text",
|
||||
help="Type of output to generate {text,html,xml,tag}.")
|
||||
output_params.add_argument("--codec", "-c", type=str, default="utf-8",
|
||||
help="Text encoding to use in output file.")
|
||||
output_params.add_argument("--output-dir", "-O", default=None,
|
||||
help="The output directory to put extracted images in. If not given, images are not "
|
||||
"extracted.")
|
||||
output_params.add_argument("--layoutmode", "-Y", default="normal", type=str,
|
||||
help="Type of layout to use when generating html {normal,exact,loose}. If normal, "
|
||||
"each line is positioned separately in the html. If exact, each character is "
|
||||
"positioned separately in the html. If loose, same result as normal but with an "
|
||||
"additional newline after each text line. Only used when output_type is html.")
|
||||
output_params.add_argument("--scale", "-s", type=float, default=1.0,
|
||||
help="The amount of zoom to use when generating html file. Only used when output_type "
|
||||
"is html.")
|
||||
output_params.add_argument("--strip-control", "-S", default=False, action="store_true",
|
||||
help="Remove control statements from text. Only used when output_type is xml.")
|
||||
return parser
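
Note on the grouped parser above: a minimal sketch, assuming a trimmed-down stand-in (`build_parser` is a hypothetical name, not part of the commit), of how `add_argument_group` behaves. Groups only change how `--help` is laid out; every option still ends up in one flat namespace. The sketch also shows that an option declared with `default=True` and `action="store_true"`, as `--all-texts` is in the hunk, is true whether or not the flag is passed.

```python
import argparse


def build_parser():
    """Hypothetical, reduced stand-in for the parser built in the hunk."""
    parser = argparse.ArgumentParser(description="Sketch of grouped options")
    parser.add_argument("files", type=str, nargs="+",
                        help="One or more paths to PDF files.")

    # Groups affect only the --help layout, not where parsed values are stored.
    parse_params = parser.add_argument_group(
        "Parser", description="Used during PDF parsing")
    parse_params.add_argument("--page-numbers", type=int, default=None, nargs="+")
    parse_params.add_argument("--maxpages", "-m", type=int, default=0)

    la_params = parser.add_argument_group(
        "Layout analysis", description="Used during layout analysis.")
    la_params.add_argument("--char-margin", "-M", type=float, default=2.0)
    # Same declaration as in the hunk: default=True combined with store_true.
    la_params.add_argument("--all-texts", "-A", default=True, action="store_true")
    return parser


parser = build_parser()
# The positional file comes first so that --page-numbers (nargs="+") does not
# try to consume it as an integer.
options = parser.parse_args(["sample.pdf", "--page-numbers", "1", "3", "-M", "1.5"])
without_flag = parser.parse_args(["sample.pdf"])
with_flag = parser.parse_args(["sample.pdf", "-A"])
```

Because both `without_flag.all_texts` and `with_flag.all_texts` are true, the flag as declared cannot switch the behaviour off from the command line.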
|
||||
|
||||
|
||||
|
|
|
@ -1,30 +0,0 @@
|
|||
# -*- mode: python -*-
|
||||
|
||||
block_cipher = None
|
||||
|
||||
|
||||
a = Analysis(['pdf2txt.py'],
|
||||
pathex=['C:\\Dev\\Python\\pdfminer.six\\tools'],
|
||||
binaries=[],
|
||||
datas=[],
|
||||
hiddenimports=[],
|
||||
hookspath=[],
|
||||
runtime_hooks=[],
|
||||
excludes=['django','matplotlib','PIL','numpy','qt5'],
|
||||
win_no_prefer_redirects=False,
|
||||
win_private_assemblies=False,
|
||||
cipher=block_cipher)
|
||||
|
||||
pyz = PYZ(a.pure, a.zipped_data,
|
||||
cipher=block_cipher)
|
||||
exe = EXE(pyz,
|
||||
a.scripts,
|
||||
a.binaries,
|
||||
a.zipfiles,
|
||||
a.datas,
|
||||
name='pdf2txt',
|
||||
debug=False,
|
||||
strip=False,
|
||||
upx=True,
|
||||
runtime_tmpdir=None,
|
||||
console=True )
|
|
@ -11,28 +11,34 @@ pdfminer.settings.STRICT = False
|
|||
import pdfminer.high_level
|
||||
import pdfminer.layout
|
||||
|
||||
def compare(file1,file2,**args):
|
||||
if args.get('_py2_no_more_posargs',None) is not None:
|
||||
raise ValueError("Too many positional arguments passed.")
|
||||
logging.basicConfig()
|
||||
|
||||
|
||||
def compare(file1, file2, **kwargs):
|
||||
if '_py2_no_more_posargs' in kwargs:
|
||||
raise DeprecationWarning(
|
||||
'The `_py2_no_more_posargs` argument will be removed in January 2020. At '
|
||||
'that moment pdfminer.six will stop supporting Python 2. Please '
|
||||
'upgrade to Python 3. For more information see '
|
||||
'https://github.com/pdfminer/pdfminer.six/issues/194')
|
||||
|
||||
# If any LAParams group arguments were passed, create an LAParams object and
|
||||
# populate with given args. Otherwise, set it to None.
|
||||
if args.get('laparams',None) is None:
|
||||
if kwargs.get('laparams', None) is None:
|
||||
laparams = pdfminer.layout.LAParams()
|
||||
for param in ("all_texts", "detect_vertical", "word_margin", "char_margin", "line_margin", "boxes_flow"):
|
||||
paramv = args.get(param, None)
|
||||
paramv = kwargs.get(param, None)
|
||||
if paramv is not None:
|
||||
setattr(laparams, param, paramv)
|
||||
args['laparams']=laparams
|
||||
kwargs['laparams']=laparams
|
||||
|
||||
s1=six.StringIO()
|
||||
with open(file1, "rb") as fp:
|
||||
pdfminer.high_level.extract_text_to_fp(fp,s1, **args)
|
||||
pdfminer.high_level.extract_text_to_fp(fp, s1, **kwargs)
|
||||
|
||||
s2=six.StringIO()
|
||||
with open(file2, "rb") as fp:
|
||||
pdfminer.high_level.extract_text_to_fp(fp,s2, **args)
|
||||
pdfminer.high_level.extract_text_to_fp(fp, s2, **kwargs)
|
||||
|
||||
import difflib
|
||||
s1.seek(0)
|
||||
|
@ -41,12 +47,12 @@ def compare(file1,file2,**args):
|
|||
|
||||
import os.path
|
||||
try:
|
||||
extension = os.path.splitext(args['outfile'])[1][1:4]
|
||||
extension = os.path.splitext(kwargs['outfile'])[1][1:4]
|
||||
if extension.lower()=='htm':
|
||||
return difflib.HtmlDiff().make_file(s1,s2)
|
||||
except KeyError:
|
||||
pass
|
||||
return difflib.unified_diff(s1,s2,n=args['context_lines'])
|
||||
return difflib.unified_diff(s1, s2, n=kwargs['context_lines'])
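
Note on the output selection at the end of `compare`: a minimal sketch, assuming a hypothetical helper `diff_texts` that takes the two extracted texts as plain strings. It mirrors the extension check from the hunk (characters 1 to 3 after the dot, so both `.htm` and `.html` match) but differs in two ways that are choices of this sketch only: it splits the text into explicit line lists instead of passing `StringIO` objects, and it joins the unified diff into a single string instead of returning the generator.

```python
import difflib
import os.path


def diff_texts(text1, text2, outfile=None, context_lines=3):
    """Hypothetical helper: diff two texts as HTML or as a unified diff."""
    lines1 = text1.splitlines(keepends=True)
    lines2 = text2.splitlines(keepends=True)
    if outfile is not None:
        # ".html"[1:4] and ".htm"[1:4] both give "htm".
        extension = os.path.splitext(outfile)[1][1:4]
        if extension.lower() == "htm":
            return difflib.HtmlDiff().make_file(lines1, lines2)
    # Fall back to a unified diff with the requested number of context lines.
    return "".join(difflib.unified_diff(lines1, lines2, n=context_lines))


unified = diff_texts("hello\nworld\n", "hello\nthere\n")
html = diff_texts("hello\nworld\n", "hello\nthere\n", outfile="report.html")
```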
|
||||
|
||||
|
||||
# main
|
||||
|
@ -86,9 +92,11 @@ def main(args=None):
|
|||
P.add_argument("-C", "--disable-caching", default=False, action="store_true", help="Disable caching")
|
||||
P.add_argument("-S", "--strip-control", default=False, action="store_true", help="Strip control in XML mode")
|
||||
|
||||
|
||||
A = P.parse_args(args=args)
|
||||
|
||||
if A.debug:
|
||||
logging.getLogger().setLevel(logging.DEBUG)
|
||||
|
||||
if A.page_numbers:
|
||||
A.page_numbers = set([x-1 for x in A.page_numbers])
|
||||
if A.pagenos:
|
||||
|
|
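
Note on the page-number handling in `main`: a minimal sketch of the conversion from the 1-based page numbers given on the command line to a 0-based set, as the `A.page_numbers` line in the hunk does. The helper name `to_zero_based` is hypothetical. The body of the `if A.pagenos:` branch is not visible in this hunk, so the comma-splitting below is an assumption based only on the help text for `--pagenos`, and which option wins when both are given is likewise an arbitrary choice of this sketch.

```python
def to_zero_based(page_numbers=None, pagenos=None):
    """Hypothetical helper: turn 1-based page selections into a 0-based set."""
    if page_numbers:
        # Same conversion as the A.page_numbers line in the hunk.
        return {x - 1 for x in page_numbers}
    if pagenos:
        # Assumption: the legacy value is a comma-separated string like "1,3,5".
        return {int(x) - 1 for x in pagenos.split(",")}
    # Nothing selected.
    return None


from_list = to_zero_based(page_numbers=[1, 3, 3])
from_legacy = to_zero_based(pagenos="2,4")
nothing = to_zero_based()
```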