Community-maintained fork of pdfminer - we fathom PDF
 
 
zacc806 4d38e2d158 Update README.md 2023-08-07 18:30:27 +06:00
.github Run black locally with nox (#776) 2022-06-26 18:25:28 +02:00
cmaprsrc Fix typos 2016-09-13 16:25:09 +02:00
docs Add FAQ about special characters (#829) 2022-11-05 17:22:08 +01:00
pdfminer Storing Bezier path and dashing style of line in LTCurve (#801) 2022-11-06 16:50:37 +01:00
samples Ignore empty characters when analyzing layout (#689) 2022-02-22 21:20:26 +01:00
tests Storing Bezier path and dashing style of line in LTCurve (#801) 2022-11-06 16:50:37 +01:00
tools Deprecate usage of `if __name__ == "__main__"` in scripts that are not documented. Also deprecate usage of scripts that are only there for testing purposes. (#756) 2022-06-25 23:11:10 +02:00
.flake8 Check blackness in github actions (#711) 2022-02-11 22:46:51 +01:00
.gitignore Update development tools: travis ci to github actions, tox to nox, nose to pytest (#704) 2022-02-02 22:24:32 +01:00
CHANGELOG.md Storing Bezier path and dashing style of line in LTCurve (#801) 2022-11-06 16:50:37 +01:00
CONTRIBUTING.md Run black locally with nox (#776) 2022-06-26 18:25:28 +02:00
LICENSE Added: LICENSE 2016-09-11 23:38:18 +09:00
MANIFEST.in Remove samples/ directory from source distribution to prevent downloading all pdf's when installing pdfminer.six (#364) 2020-01-24 12:36:02 +01:00
Makefile Add github action for releasing to pypi if git tag is added. (#727) 2022-03-19 20:46:00 +01:00
README.md Update README.md 2023-08-07 18:30:27 +06:00
mypy.ini Use charset-normalizer instead of chardet (#744) 2022-04-20 21:42:50 +02:00
noxfile.py Run black locally with nox (#776) 2022-06-26 18:25:28 +02:00
requirements.txt Committed script 2023-08-07 18:27:38 +06:00
setup.py Run black locally with nox (#776) 2022-06-26 18:25:28 +02:00
some.py Committed script 2023-08-07 18:27:38 +06:00

README.md

How to use

  • Install Python 3.6 or newer.

  • Install pdfminer.six.

    pip install pdfminer.six

  • Optionally, install the extra dependencies for extracting images.

    pip install 'pdfminer.six[image]'

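  • Install Tesseract, the OCR engine that pytesseract wraps (the command below is for Debian/Ubuntu).

    sudo apt install tesseract-ocr

  • Install Poppler (again for Debian/Ubuntu).

    sudo apt install poppler-utils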
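
To confirm that the steps above took effect, a small check script can help. The following is a minimal sketch, not part of this repository: the function name `check_environment` is hypothetical, and it assumes that `tesseract-ocr` puts a `tesseract` executable on the PATH and that `poppler-utils` provides `pdftoppm`. It uses only the standard library.

```python
import importlib.util
import shutil
import sys


def check_environment():
    """Return a dict mapping each prerequisite to True (found) or False."""
    return {
        # The README asks for Python 3.6 or newer.
        "python>=3.6": sys.version_info >= (3, 6),
        # pdfminer.six installs an importable package named "pdfminer".
        "pdfminer.six": importlib.util.find_spec("pdfminer") is not None,
        # Assumption: tesseract-ocr provides the "tesseract" executable.
        "tesseract": shutil.which("tesseract") is not None,
        # Assumption: poppler-utils provides tools such as "pdftoppm".
        "poppler": shutil.which("pdftoppm") is not None,
    }


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Any entry reported as missing points back to the corresponding install step. The check only looks for the package and the executables; it does not verify their versions.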