Remove samples/ directory from source distribution to prevent downloading all PDFs when installing pdfminer.six (#364)

Fixes #363 

* Remove samples/ and docs/ from the source distribution. The samples/ directory contains PDFs for testing purposes, and docs/ contains the Read the Docs documentation, which is published online.

* Remove issue-00152-embedded-pdf.pdf because it contains a possible exploit.

See https://www.microsoft.com/en-us/wdsi/threats/malware-encyclopedia-description?Name=Exploit%3AJS%2FShellCode.gen
And https://github.com/pdfminer/pdfminer.six/issues/363

* Added line to CHANGELOG.md

* Remove unused imports
Pieter Marsman 2020-01-24 12:36:02 +01:00 committed by GitHub
parent bc494ff03c
commit 1c3047b68b
4 changed files with 5 additions and 17 deletions

@@ -5,7 +5,8 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
 ## [Unreleased]
-Nothing
+### Security
+- Removed samples/issue-00152-embedded-pdf.pdf because it contains a possible security thread; a javascript enabled object ([#364](https://github.com/pdfminer/pdfminer.six/pull/364))
 ## [20200121] - 2020-01-21
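
The changelog entry refers to a JavaScript-enabled object. As a rough illustration only, and not something this commit adds, a sample file could be screened for the `/JS` and `/JavaScript` names before it is committed. The helper name `mentions_javascript` is hypothetical, and the check is a naive byte scan: it misses names hidden in compressed object streams or written with `#xx` escapes, so a clean result proves nothing.

```python
import re

# /JS and /JavaScript are the PDF names associated with JavaScript actions.
# The lookahead avoids matching longer names such as /JSFoo.
_JS_NAME = re.compile(rb"/(?:JavaScript|JS)(?![A-Za-z0-9])")


def mentions_javascript(pdf_bytes):
    """Return True if the raw bytes contain a /JS or /JavaScript name.

    Hypothetical helper: a heuristic only, not a malware scanner.
    """
    return _JS_NAME.search(pdf_bytes) is not None
```

A file could be checked with `mentions_javascript(open(path, 'rb').read())`, where `path` points at the sample under review.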

@@ -4,8 +4,8 @@ include *.txt
 include *.md
 include *.py
 graft cmaprsrc
-graft docs
 graft pdfminer
-graft samples
 graft tools
 global-exclude *.pyc
+prune samples
+prune docs
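
One way to confirm the effect of the two `prune` lines is to build the sdist and list what ended up in the archive. The sketch below is not part of this commit. The function name `unwanted_members` is hypothetical, and it assumes the usual sdist layout in which everything sits under a single top-level folder.

```python
import tarfile
from pathlib import PurePosixPath


def unwanted_members(sdist_path, pruned=("samples", "docs")):
    """Return archive members that live under a directory that should be pruned.

    Assumes the sdist wraps all files in one top-level folder, so the
    directory to check is the second component of each member path.
    """
    found = []
    with tarfile.open(sdist_path) as archive:
        for name in archive.getnames():
            parts = PurePosixPath(name).parts
            if len(parts) > 1 and parts[1] in pruned:
                found.append(name)
    return found
```

After building with `python setup.py sdist`, calling `unwanted_members` on the archive written to dist/ should return an empty list if the manifest change works as intended.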

@@ -1,5 +1,4 @@
-from shutil import rmtree
-from tempfile import NamedTemporaryFile, mkdtemp
+from tempfile import NamedTemporaryFile
 from helpers import absolute_sample_path
 from tools import dumppdf
@@ -37,15 +36,3 @@ class TestDumpPDF():
     def test_6(self):
         run('nonfree/naacl06-shinyama.pdf', '-t -a')
-    def test_embedded_font_filename(self):
-        """If UF font file name does not exist, then F should be used
-        Related issue: https://github.com/pdfminer/pdfminer.six/issues/152
-        """
-        output_dir = mkdtemp()
-        try:
-            run('contrib/issue-00152-embedded-pdf.pdf',
-                '--extract-embedded %s' % output_dir)
-        finally:
-            rmtree(output_dir)