Consistent instructions for how to install and use pdfminer.six (#793)

2022-11-05 16:30:39 +01:00
parent ad6587c697
commit 769dbb6343
3 changed files with 33 additions and 11 deletions
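
The updated instructions call `pdf2txt.py` and `dumppdf.py` by bare name, which only works if `pip install pdfminer.six` has placed both scripts on `PATH`. The following is a minimal sketch for checking that, not part of this commit. The helper name `locate_tools` is hypothetical, and it relies only on the standard-library `shutil.which`.

```python
import shutil
from typing import Dict, Optional, Sequence


def locate_tools(
    names: Sequence[str] = ("pdf2txt.py", "dumppdf.py"),
) -> Dict[str, Optional[str]]:
    """Map each script name to its resolved path on PATH, or None if absent.

    `locate_tools` is a hypothetical helper for illustration; the default
    names are the two scripts the documentation in this diff invokes.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If either script is reported as missing, the install may have gone into a different environment than the active shell. The earlier form `python tools/pdf2txt.py example.pdf` should still work from a source checkout.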
--- a/README.md
+++ b/README.md
@@ -40,7 +40,7 @@ How to use
 ----------

 * Install Python 3.6 or newer.
-* Install
+* Install pdfminer.six.

  `pip install pdfminer.six`

@@ -48,9 +48,18 @@ How to use

  `pip install 'pdfminer.six[image]'`

-* Use command-line interface to extract text from pdf:
+* Use the command-line interface to extract text from pdf.

-  `python pdf2txt.py samples/simple1.pdf`
+  `pdf2txt.py example.pdf`
+
+* Or use it with Python.
+
+```python
+from pdfminer.high_level import extract_text
+
+text = extract_text("example.pdf")
+print(text)
+```

 Contributing
 ------------
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -59,18 +59,31 @@ Features
 Installation instructions
 =========================

-Before using it, you must install it using Python 3.6 or newer.
+* Install Python 3.6 or newer.
+* Install pdfminer.six.

 ::
+    $ pip install pdfminer.six

-    $ pip install pdfminer.six
-
-
-Optionally install extra dependencies that are needed to extract jpg images.
+* (Optionally) install extra dependencies for extracting images.

 ::
+    $ pip install 'pdfminer.six[image]'
+
+* Use the command-line interface to extract text from pdf.
+
+::
+    $ pdf2txt.py example.pdf
+
+* Or use it with Python.
+
+.. code-block:: python
+
+    from pdfminer.high_level import extract_text
+
+    text = extract_text("example.pdf")
+    print(text)

-    $ pip install 'pdfminer.six[image]'


 Contributing
--- a/docs/source/tutorial/commandline.rst
+++ b/docs/source/tutorial/commandline.rst
@@ -18,7 +18,7 @@ pdf2txt.py

 ::

-    $ python tools/pdf2txt.py example.pdf
+    $ pdf2txt.py example.pdf
    all the text from the pdf appears on the command line

 The :ref:`api_pdf2txt` tool extracts all the text from a PDF. It uses layout
@@ -29,7 +29,7 @@ dumppdf.py

 ::

-    $ python tools/dumppdf.py -a example.pdf
+    $ dumppdf.py -a example.pdf
    <pdf><object id="1">
    ...
    </object>