Deprecate usage of `if __name__ == "__main__"` in scripts that are not documented. Also deprecate usage of scripts that are only there for testing purposes. (#756)
* Deprecate usage of `if __name__ == "__main__"` in scripts that are not document. Also deprecate usage of scripts that are only there for testing purposes. * Add CHANGELOG.md * Cleanup CHANGELOG.md * Cleanup CHANGELOG.md * Undo deleting conf_glyphlist.py and conf_afm.py and add a deprecation warning insteadpull/772/head
parent
86e34873e4
commit
6cbee25b3e
22
CHANGELOG.md
22
CHANGELOG.md
|
@ -10,6 +10,10 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
|
||||||
|
|
||||||
- Sphinx errors during building of documentation ([#760](https://github.com/pdfminer/pdfminer.six/pull/760))
|
- Sphinx errors during building of documentation ([#760](https://github.com/pdfminer/pdfminer.six/pull/760))
|
||||||
|
|
||||||
|
### Deprecated
|
||||||
|
|
||||||
|
- Usage of `if __name__ == "__main__"` where it was only intended for testing purposes ([#756](https://github.com/pdfminer/pdfminer.six/pull/756))
|
||||||
|
|
||||||
## [20220524]
|
## [20220524]
|
||||||
|
|
||||||
### Fixed
|
### Fixed
|
||||||
|
@ -86,7 +90,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
|
||||||
- Using `io.TextIOBase` as the file to write to ([#616](https://github.com/pdfminer/pdfminer.six/pull/616))
|
- Using `io.TextIOBase` as the file to write to ([#616](https://github.com/pdfminer/pdfminer.six/pull/616))
|
||||||
- Parsing \r\n after the escape character in a literal string ([#616](https://github.com/pdfminer/pdfminer.six/pull/616))
|
- Parsing \r\n after the escape character in a literal string ([#616](https://github.com/pdfminer/pdfminer.six/pull/616))
|
||||||
|
|
||||||
## Removed
|
### Removed
|
||||||
- Support for Python 3.4 and 3.5 ([#522](https://github.com/pdfminer/pdfminer.six/pull/522))
|
- Support for Python 3.4 and 3.5 ([#522](https://github.com/pdfminer/pdfminer.six/pull/522))
|
||||||
- Unused dependency on `sortedcontainers` package ([#525](https://github.com/pdfminer/pdfminer.six/pull/525))
|
- Unused dependency on `sortedcontainers` package ([#525](https://github.com/pdfminer/pdfminer.six/pull/525))
|
||||||
- Support for non-standard output streams that are not binary ([#523](https://github.com/pdfminer/pdfminer.six/pull/523))
|
- Support for non-standard output streams that are not binary ([#523](https://github.com/pdfminer/pdfminer.six/pull/523))
|
||||||
|
@ -152,12 +156,12 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
|
||||||
### Changed
|
### Changed
|
||||||
- Group text lines if they are centered ([#384](https://github.com/pdfminer/pdfminer.six/pull/384))
|
- Group text lines if they are centered ([#384](https://github.com/pdfminer/pdfminer.six/pull/384))
|
||||||
|
|
||||||
## [20200124] - 2020-01-24
|
## [20200124]
|
||||||
|
|
||||||
### Security
|
### Security
|
||||||
- Removed samples/issue-00152-embedded-pdf.pdf because it contains a possible security thread; a javascript enabled object ([#364](https://github.com/pdfminer/pdfminer.six/pull/364))
|
- Removed samples/issue-00152-embedded-pdf.pdf because it contains a possible security thread; a javascript enabled object ([#364](https://github.com/pdfminer/pdfminer.six/pull/364))
|
||||||
|
|
||||||
## [20200121] - 2020-01-21
|
## [20200121]
|
||||||
|
|
||||||
### Fixed
|
### Fixed
|
||||||
- Interpret two's complement integer as unsigned integer ([#352](https://github.com/pdfminer/pdfminer.six/pull/352))
|
- Interpret two's complement integer as unsigned integer ([#352](https://github.com/pdfminer/pdfminer.six/pull/352))
|
||||||
|
@ -168,20 +172,20 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
|
||||||
### Removed
|
### Removed
|
||||||
- The command-line utility latin2ascii.py ([#360](https://github.com/pdfminer/pdfminer.six/pull/360))
|
- The command-line utility latin2ascii.py ([#360](https://github.com/pdfminer/pdfminer.six/pull/360))
|
||||||
|
|
||||||
## [20200104] - 2019-01-04
|
## [20200104]
|
||||||
|
|
||||||
## Removed
|
### Removed
|
||||||
- Support for Python 2 ([#346](https://github.com/pdfminer/pdfminer.six/pull/346))
|
- Support for Python 2 ([#346](https://github.com/pdfminer/pdfminer.six/pull/346))
|
||||||
|
|
||||||
### Changed
|
### Changed
|
||||||
- Enforce pep8 coding style by adding flake8 to CI ([#345](https://github.com/pdfminer/pdfminer.six/pull/345))
|
- Enforce pep8 coding style by adding flake8 to CI ([#345](https://github.com/pdfminer/pdfminer.six/pull/345))
|
||||||
|
|
||||||
## [20191110] - 2019-11-10
|
## [20191110]
|
||||||
|
|
||||||
### Fixed
|
### Fixed
|
||||||
- Wrong order of text box grouping introduced by PR #315 ([#335](https://github.com/pdfminer/pdfminer.six/pull/335))
|
- Wrong order of text box grouping introduced by PR #315 ([#335](https://github.com/pdfminer/pdfminer.six/pull/335))
|
||||||
|
|
||||||
## [20191107] - 2019-11-07
|
## [20191107]
|
||||||
|
|
||||||
### Deprecated
|
### Deprecated
|
||||||
- The argument `_py2_no_more_posargs` because Python2 is removed on January
|
- The argument `_py2_no_more_posargs` because Python2 is removed on January
|
||||||
|
@ -208,7 +212,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
|
||||||
### Removed
|
### Removed
|
||||||
- Files for external applications such as django, cgi and pyinstaller ([#320](https://github.com/pdfminer/pdfminer.six/pull/320))
|
- Files for external applications such as django, cgi and pyinstaller ([#320](https://github.com/pdfminer/pdfminer.six/pull/320))
|
||||||
|
|
||||||
## [20191020] - 2019-10-20
|
## [20191020]
|
||||||
|
|
||||||
### Deprecated
|
### Deprecated
|
||||||
- Support for Python 2 is dropped at January 1st, 2020 ([#307](https://github.com/pdfminer/pdfminer.six/pull/307))
|
- Support for Python 2 is dropped at January 1st, 2020 ([#307](https://github.com/pdfminer/pdfminer.six/pull/307))
|
||||||
|
@ -230,7 +234,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
|
||||||
### Changed
|
### Changed
|
||||||
- All dependencies are managed in `setup.py` ([#306](https://github.com/pdfminer/pdfminer.six/pull/306) and [#219](https://github.com/pdfminer/pdfminer.six/pull/219))
|
- All dependencies are managed in `setup.py` ([#306](https://github.com/pdfminer/pdfminer.six/pull/306) and [#219](https://github.com/pdfminer/pdfminer.six/pull/219))
|
||||||
|
|
||||||
## [20181108] - 2018-11-08
|
## [20181108]
|
||||||
|
|
||||||
### Changed
|
### Changed
|
||||||
- Speedup layout analysis ([#141](https://github.com/pdfminer/pdfminer.six/pull/141))
|
- Speedup layout analysis ([#141](https://github.com/pdfminer/pdfminer.six/pull/141))
|
||||||
|
|
|
@ -477,6 +477,15 @@ class CMapParser(PSStackParser[PSKeyword]):
|
||||||
|
|
||||||
|
|
||||||
def main(argv: List[str]) -> None:
|
def main(argv: List[str]) -> None:
|
||||||
|
from warnings import warn
|
||||||
|
|
||||||
|
warn(
|
||||||
|
"The function main() from cmapdb.py will be removed in 2023. It was probably "
|
||||||
|
"introduced for testing purposes a long time ago, and no longer relevant. "
|
||||||
|
"Feel free to create a GitHub issue if you disagree.",
|
||||||
|
DeprecationWarning,
|
||||||
|
)
|
||||||
|
|
||||||
args = argv[1:]
|
args = argv[1:]
|
||||||
for fname in args:
|
for fname in args:
|
||||||
fp = open(fname, "rb")
|
fp = open(fname, "rb")
|
||||||
|
|
|
@ -27,6 +27,48 @@ The following data were extracted from the AFM files:
|
||||||
### END Verbatim copy of the license part
|
### END Verbatim copy of the license part
|
||||||
|
|
||||||
# flake8: noqa
|
# flake8: noqa
|
||||||
|
from typing import Dict
|
||||||
|
|
||||||
|
|
||||||
|
def convert_font_metrics(path: str) -> None:
|
||||||
|
"""Convert an AFM file to a mapping of font metrics.
|
||||||
|
|
||||||
|
See below for the output.
|
||||||
|
"""
|
||||||
|
fonts = {}
|
||||||
|
with open(path, "r") as fileinput:
|
||||||
|
for line in fileinput.readlines():
|
||||||
|
f = line.strip().split(" ")
|
||||||
|
if not f:
|
||||||
|
continue
|
||||||
|
k = f[0]
|
||||||
|
if k == "FontName":
|
||||||
|
fontname = f[1]
|
||||||
|
props = {"FontName": fontname, "Flags": 0}
|
||||||
|
chars: Dict[int, int] = {}
|
||||||
|
fonts[fontname] = (props, chars)
|
||||||
|
elif k == "C":
|
||||||
|
cid = int(f[1])
|
||||||
|
if 0 <= cid and cid <= 255:
|
||||||
|
width = int(f[4])
|
||||||
|
chars[cid] = width
|
||||||
|
elif k in ("CapHeight", "XHeight", "ItalicAngle", "Ascender", "Descender"):
|
||||||
|
k = {"Ascender": "Ascent", "Descender": "Descent"}.get(k, k)
|
||||||
|
props[k] = float(f[1])
|
||||||
|
elif k in ("FontName", "FamilyName", "Weight"):
|
||||||
|
k = {"FamilyName": "FontFamily", "Weight": "FontWeight"}.get(k, k)
|
||||||
|
props[k] = f[1]
|
||||||
|
elif k == "IsFixedPitch":
|
||||||
|
if f[1].lower() == "true":
|
||||||
|
props["Flags"] = 64
|
||||||
|
elif k == "FontBBox":
|
||||||
|
props[k] = tuple(map(float, f[1:5]))
|
||||||
|
print("# -*- python -*-")
|
||||||
|
print("FONT_METRICS = {")
|
||||||
|
for (fontname, (props, chars)) in fonts.items():
|
||||||
|
print(" {!r}: {!r},".format(fontname, (props, chars)))
|
||||||
|
print("}")
|
||||||
|
|
||||||
|
|
||||||
FONT_METRICS = {
|
FONT_METRICS = {
|
||||||
"Courier": (
|
"Courier": (
|
||||||
|
|
|
@ -51,6 +51,32 @@ The following data was taken by
|
||||||
# (1) glyph name
|
# (1) glyph name
|
||||||
# (2) Unicode scalar value
|
# (2) Unicode scalar value
|
||||||
|
|
||||||
|
|
||||||
|
def convert_glyphlist(path: str) -> None:
|
||||||
|
"""Convert a glyph list into a python representation.
|
||||||
|
|
||||||
|
See output below.
|
||||||
|
"""
|
||||||
|
state = 0
|
||||||
|
with open(path, "r") as fileinput:
|
||||||
|
for line in fileinput.readlines():
|
||||||
|
line = line.strip()
|
||||||
|
if not line or line.startswith("#"):
|
||||||
|
if state == 1:
|
||||||
|
state = 2
|
||||||
|
print("}\n")
|
||||||
|
print(line)
|
||||||
|
continue
|
||||||
|
if state == 0:
|
||||||
|
print("\nglyphname2unicode = {")
|
||||||
|
state = 1
|
||||||
|
(name, x) = line.split(";")
|
||||||
|
codes = x.split(" ")
|
||||||
|
print(
|
||||||
|
" {!r}: u'{}',".format(name, "".join("\\u%s" % code for code in codes))
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
glyphname2unicode = {
|
glyphname2unicode = {
|
||||||
"A": "\u0041",
|
"A": "\u0041",
|
||||||
"AE": "\u00C6",
|
"AE": "\u00C6",
|
||||||
|
|
|
@ -19,12 +19,12 @@ from typing import (
|
||||||
|
|
||||||
from . import settings
|
from . import settings
|
||||||
from .cmapdb import CMap
|
from .cmapdb import CMap
|
||||||
from .cmapdb import IdentityUnicodeMap
|
|
||||||
from .cmapdb import CMapBase
|
from .cmapdb import CMapBase
|
||||||
from .cmapdb import CMapDB
|
from .cmapdb import CMapDB
|
||||||
from .cmapdb import CMapParser
|
from .cmapdb import CMapParser
|
||||||
from .cmapdb import UnicodeMap
|
|
||||||
from .cmapdb import FileUnicodeMap
|
from .cmapdb import FileUnicodeMap
|
||||||
|
from .cmapdb import IdentityUnicodeMap
|
||||||
|
from .cmapdb import UnicodeMap
|
||||||
from .encodingdb import EncodingDB
|
from .encodingdb import EncodingDB
|
||||||
from .encodingdb import name2unicode
|
from .encodingdb import name2unicode
|
||||||
from .fontmetrics import FONT_METRICS
|
from .fontmetrics import FONT_METRICS
|
||||||
|
@ -1187,6 +1187,15 @@ class PDFCIDFont(PDFFont):
|
||||||
|
|
||||||
|
|
||||||
def main(argv: List[str]) -> None:
|
def main(argv: List[str]) -> None:
|
||||||
|
from warnings import warn
|
||||||
|
|
||||||
|
warn(
|
||||||
|
"The function main() from pdffont.py will be removed in 2023. It was probably "
|
||||||
|
"introduced for testing purposes a long time ago, and no longer relevant. "
|
||||||
|
"Feel free to create a GitHub issue if you disagree.",
|
||||||
|
DeprecationWarning,
|
||||||
|
)
|
||||||
|
|
||||||
for fname in argv[1:]:
|
for fname in argv[1:]:
|
||||||
fp = open(fname, "rb")
|
fp = open(fname, "rb")
|
||||||
font = CFFFont(fname, fp)
|
font = CFFFont(fname, fp)
|
||||||
|
|
|
@ -168,7 +168,3 @@ class TestExtractPages(unittest.TestCase):
|
||||||
elements = [element for element in page if isinstance(element, LTTextContainer)]
|
elements = [element for element in page if isinstance(element, LTTextContainer)]
|
||||||
self.assertEqual(len(elements), 1)
|
self.assertEqual(len(elements), 1)
|
||||||
self.assertEqual(elements[0].get_text(), "Text1\nText2\nText3\n")
|
self.assertEqual(elements[0].get_text(), "Text1\nText2\nText3\n")
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
|
||||||
unittest.main()
|
|
||||||
|
|
|
@ -2,6 +2,7 @@
|
||||||
|
|
||||||
import sys
|
import sys
|
||||||
import fileinput
|
import fileinput
|
||||||
|
from warnings import warn
|
||||||
|
|
||||||
|
|
||||||
def main(argv):
|
def main(argv):
|
||||||
|
@ -41,4 +42,11 @@ def main(argv):
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
|
warn(
|
||||||
|
"The file conf_afm.py will be removed in 2023. Its functionality is"
|
||||||
|
"moved to pdfminer/font_metrics.py. Feel free to create a GitHub "
|
||||||
|
"issue if you disagree.",
|
||||||
|
DeprecationWarning,
|
||||||
|
)
|
||||||
|
|
||||||
sys.exit(main(sys.argv)) # type: ignore[no-untyped-call]
|
sys.exit(main(sys.argv)) # type: ignore[no-untyped-call]
|
||||||
|
|
|
@ -1,8 +1,8 @@
|
||||||
#!/usr/bin/env python3
|
#!/usr/bin/env python3
|
||||||
|
|
||||||
import sys
|
|
||||||
import pickle as pickle
|
|
||||||
import codecs
|
import codecs
|
||||||
|
import pickle as pickle
|
||||||
|
import sys
|
||||||
|
|
||||||
|
|
||||||
class CMapConverter:
|
class CMapConverter:
|
||||||
|
@ -19,6 +19,7 @@ class CMapConverter:
|
||||||
|
|
||||||
def get_maps(self, enc):
|
def get_maps(self, enc):
|
||||||
if enc.endswith("-H"):
|
if enc.endswith("-H"):
|
||||||
|
|
||||||
(hmapenc, vmapenc) = (enc, None)
|
(hmapenc, vmapenc) = (enc, None)
|
||||||
elif enc == "H":
|
elif enc == "H":
|
||||||
(hmapenc, vmapenc) = ("H", "V")
|
(hmapenc, vmapenc) = ("H", "V")
|
||||||
|
|
|
@ -2,6 +2,7 @@
|
||||||
|
|
||||||
import sys
|
import sys
|
||||||
import fileinput
|
import fileinput
|
||||||
|
from warnings import warn
|
||||||
|
|
||||||
|
|
||||||
def main(argv):
|
def main(argv):
|
||||||
|
@ -23,4 +24,10 @@ def main(argv):
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
|
warn(
|
||||||
|
"The file conf_glpyhlist.py will be removed in 2023. Its functionality"
|
||||||
|
"is moved to pdfminer/glyphlist.py. Feel free to create a GitHub issue "
|
||||||
|
"if you disagree.",
|
||||||
|
DeprecationWarning,
|
||||||
|
)
|
||||||
sys.exit(main(sys.argv)) # type: ignore[no-untyped-call]
|
sys.exit(main(sys.argv)) # type: ignore[no-untyped-call]
|
||||||
|
|
|
@ -7,10 +7,18 @@ import io
|
||||||
import logging
|
import logging
|
||||||
import sys
|
import sys
|
||||||
from typing import Any, Iterable, List, Optional
|
from typing import Any, Iterable, List, Optional
|
||||||
|
from warnings import warn
|
||||||
|
|
||||||
import pdfminer.settings
|
import pdfminer.settings
|
||||||
from pdfminer import high_level, layout
|
from pdfminer import high_level, layout
|
||||||
|
|
||||||
|
warn(
|
||||||
|
"The file pdfdiff.py will be removed in 2023. It was probably introduced for "
|
||||||
|
"testing purposes a long time ago, and no longer relevant. Feel free to create a "
|
||||||
|
"GitHub issue if you disagree.",
|
||||||
|
DeprecationWarning,
|
||||||
|
)
|
||||||
|
|
||||||
pdfminer.settings.STRICT = False
|
pdfminer.settings.STRICT = False
|
||||||
|
|
||||||
|
|
||||||
|
|
|
@ -4,18 +4,25 @@
|
||||||
# print some stats to stdout
|
# print some stats to stdout
|
||||||
# Usage: pdfstats.py <PDF-filename>
|
# Usage: pdfstats.py <PDF-filename>
|
||||||
|
|
||||||
import sys
|
|
||||||
import os
|
|
||||||
import collections
|
import collections
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
from typing import Any, Counter, Iterator, List
|
from typing import Any, Counter, Iterator, List
|
||||||
|
from warnings import warn
|
||||||
|
|
||||||
from pdfminer.pdfparser import PDFParser
|
|
||||||
from pdfminer.pdfdocument import PDFDocument, PDFTextExtractionNotAllowed
|
|
||||||
from pdfminer.pdfpage import PDFPage
|
|
||||||
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
|
|
||||||
from pdfminer.converter import PDFPageAggregator
|
from pdfminer.converter import PDFPageAggregator
|
||||||
from pdfminer.layout import LAParams, LTContainer
|
from pdfminer.layout import LAParams, LTContainer
|
||||||
|
from pdfminer.pdfdocument import PDFDocument, PDFTextExtractionNotAllowed
|
||||||
|
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
|
||||||
|
from pdfminer.pdfpage import PDFPage
|
||||||
|
from pdfminer.pdfparser import PDFParser
|
||||||
|
|
||||||
|
warn(
|
||||||
|
"The file pdfstats.py will be removed in 2023. It was probably introduced for "
|
||||||
|
"testing purposes a long time ago, and no longer relevant. Feel free to create a "
|
||||||
|
"GitHub issue if you disagree.",
|
||||||
|
DeprecationWarning,
|
||||||
|
)
|
||||||
|
|
||||||
_, SCRIPT = os.path.split(__file__)
|
_, SCRIPT = os.path.split(__file__)
|
||||||
|
|
||||||
|
|
|
@ -2,6 +2,15 @@
|
||||||
import sys
|
import sys
|
||||||
from typing import List
|
from typing import List
|
||||||
|
|
||||||
|
from warnings import warn
|
||||||
|
|
||||||
|
warn(
|
||||||
|
"The file prof.py will be removed in 2023. It was probably introduced for "
|
||||||
|
"testing purposes a long time ago, and no longer relevant. Feel free to create a "
|
||||||
|
"GitHub issue if you disagree.",
|
||||||
|
DeprecationWarning,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
def prof_main(argv: List[str]) -> int:
|
def prof_main(argv: List[str]) -> int:
|
||||||
import hotshot.stats # type: ignore[import]
|
import hotshot.stats # type: ignore[import]
|
||||||
|
|
Loading…
Reference in New Issue