pdfminer.six/samples
Pieter Marsman 1c3047b68b
Remove samples/ directory from source distribution to prevent downloading all pdf's when installing pdfminer.six (#364)
Fixes #363 

* Remove samples/ and docs/ from source distribution. The samples/ directory contains PDFs for testing purposes, and docs/ contains the Read the Docs documentation, which is published online.

* Remove issue-00152-embedded-pdf.pdf because it contains a possible exploit.

See https://www.microsoft.com/en-us/wdsi/threats/malware-encyclopedia-description?Name=Exploit%3AJS%2FShellCode.gen
And https://github.com/pdfminer/pdfminer.six/issues/363

* Added line to CHANGELOG.md

* Remove unused imports
2020-01-24 12:36:02 +01:00
contrib Remove samples/ directory from source distribution to prevent downloading all pdf's when installing pdfminer.six (#364) 2020-01-24 12:36:02 +01:00
encryption Drop support for legacy Python 2 (#346) 2020-01-04 16:47:07 +01:00
nonfree Fix failing test on develop & cleaning up test files (#319) 2019-10-26 18:42:33 +02:00
scancode Add a test for the previous fix 2017-10-16 12:35:16 +02:00
Makefile Fixed tests. 2016-06-27 00:00:00 +02:00
README Added: tests for extracting tests from pdfs with Type3 fonts (#205) 2019-10-22 18:15:59 +02:00
font-size-test.pdf Fix bug in computing character bounding box (#348) 2020-01-16 22:15:50 +01:00
jo.pdf add samples, fixed silly bugs. 2007-12-31 05:02:15 +00:00
sampleOneByteIdentityEncode.pdf Adds Test Case 2019-08-10 10:19:20 +05:30
simple1.pdf testcase added 2009-10-24 02:50:07 +00:00
simple2.pdf various cleanup for release. 2008-04-27 11:47:38 +00:00
simple3.pdf test file simple3.pdf added. 2010-08-29 06:39:41 +00:00

README

This directory contains sample PDF files.

These files (including the ones in the nonfree/ subdirectory) can be
distributed freely but do not come with explicit licensing
terms or source files.

Here are the credits of the original files:

simple1.pdf:
  (Originally taken from PDF Specification 1.7, 
  Appendix G. "Simple Text String Example" and modified)

simple2.pdf:
  (Originally taken from PDF Specification 1.7, 
  Appendix G. "Simple Graphics Example" and modified)

jo.pdf:
  Kenji Miyazawa (1896-1933, copyright expired)
  Preface of "Haru to Shura"
  (File generated from jo.tex by LaTeX and dvi2pdfm)

--
contrib/matplotlib.pdf
  Copyright 2018, James R Barlow
  Example file created in matplotlib to add a Type3 font to the samples
  Released under the terms of the "LICENSE" file

--
nonfree/cmp_itext_logo.pdf
  Bruno Lowagie
  "iText Logo - Type 3 font"
  http://gitlab.itextsupport.com/itext/sandbox/raw/master/cmpfiles/fonts/cmp_itext_logo.pdf

nonfree/dmca.pdf: 
  U.S. Copyright Office
  The Digital Millennium Copyright Act
  http://www.copyright.gov/legislation/dmca.pdf

nonfree/f1040nr.pdf:
  U.S. Department of the Treasury Internal Revenue Service
  Form 1040-NR, U.S. Nonresident Alien Income Tax Return
  http://www.irs.gov/pub/irs-pdf/f1040nr.pdf

nonfree/i1040nr.pdf:
  U.S. Department of the Treasury Internal Revenue Service
  Instructions for Form 1040-NR, U.S. Nonresident Alien Income Tax Return
  http://www.irs.gov/pub/irs-pdf/i1040nr.pdf

nonfree/kampo.pdf:
  National Printing Bureau of Japan
  Official Gazette, Vol. 4817
  http://kanpou.npb.go.jp/

nonfree/nlp2004slides.pdf:
  Yusuke Shinyama and Satoshi Sekine
  "Named Entity Discovery from Comparable News Corpora"

nonfree/naacl06-shinyama.pdf:
  Yusuke Shinyama and Satoshi Sekine
  "Preemptive Information Extraction using Unrestricted Relation Discovery"

--
The files in the encryption folder were generated with cpdf 1.7 [http://www.coherentpdf.com/]
from base.pdf, which was itself created with LibreOffice 4.1.1.2, as follows:

cpdf -encrypt 40bit foo baz base.pdf -o rc4-40.pdf
cpdf -encrypt 128bit foo baz base.pdf -o rc4-128.pdf
cpdf -encrypt AES foo baz base.pdf -o aes-128.pdf
cpdf -encrypt AES foo baz base.pdf -no-encrypt-metadata -o aes-128-m.pdf
cpdf -encrypt AES256 foo baz base.pdf -o aes-256.pdf
cpdf -encrypt AES256 foo baz base.pdf -no-encrypt-metadata -o aes-256-m.pdf
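
--
The six commands above differ only in the encryption method, the optional
-no-encrypt-metadata flag, and the output name, so they can be driven from a
small table. The following is a minimal sketch of that idea, not a script that
ships with this directory. The names VARIANTS, build_commands and regenerate
are hypothetical, and the parameter names owner and user assume that cpdf reads
the two passwords (foo and baz) in that order. It only prints the commands
unless dry_run is turned off, in which case cpdf must be on the PATH and
base.pdf must be in the current directory.

```python
import subprocess

# (method, extra flags, output file), copied from the command list above.
VARIANTS = [
    ("40bit", [], "rc4-40.pdf"),
    ("128bit", [], "rc4-128.pdf"),
    ("AES", [], "aes-128.pdf"),
    ("AES", ["-no-encrypt-metadata"], "aes-128-m.pdf"),
    ("AES256", [], "aes-256.pdf"),
    ("AES256", ["-no-encrypt-metadata"], "aes-256-m.pdf"),
]


def build_commands(source="base.pdf", owner="foo", user="baz"):
    """Return one cpdf argument list per encrypted sample file.

    The owner/user naming is an assumption about cpdf's argument order;
    the values and their positions match the commands in this README.
    """
    return [
        ["cpdf", "-encrypt", method, owner, user, source, *flags, "-o", output]
        for method, flags, output in VARIANTS
    ]


def regenerate(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in build_commands():
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError if cpdf fails.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run by default so existing sample files are never overwritten
    # by accident.
    regenerate(dry_run=True)
```

Keeping the variants in one table makes it easy to add another method or flag
without retyping the shared arguments.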