Consistent instructions for how to install and use pdfminer.six (#793)
parent ad6587c697
commit 769dbb6343
15
README.md
@@ -40,7 +40,7 @@ How to use
----------

* Install Python 3.6 or newer.
* Install
* Install pdfminer.six.

`pip install pdfminer.six`

@@ -48,9 +48,18 @@ How to use

`pip install 'pdfminer.six[image]'`

* Use command-line interface to extract text from pdf:
* Use the command-line interface to extract text from pdf.

`python pdf2txt.py samples/simple1.pdf`
`pdf2txt.py example.pdf`

* Or use it with Python.

```python
from pdfminer.high_level import extract_text

text = extract_text("example.pdf")
print(text)
```

Contributing
------------

@@ -59,18 +59,31 @@ Features
Installation instructions
=========================

Before using it, you must install it using Python 3.6 or newer.
* Install Python 3.6 or newer.
* Install pdfminer.six.

::
$ pip install pdfminer.six`

$ pip install pdfminer.six


Optionally install extra dependencies that are needed to extract jpg images.
* (Optionally) install extra dependencies for extracting images.

::
$ pip install 'pdfminer.six[image]'`

* Use the command-line interface to extract text from pdf.

::
$ pdf2txt.py example.pdf`

* Or use it with Python.

.. code-block:: python

from pdfminer.high_level import extract_text

text = extract_text("example.pdf")
print(text)

$ pip install 'pdfminer.six[image]'


Contributing

@@ -18,7 +18,7 @@ pdf2txt.py

::

$ python tools/pdf2txt.py example.pdf
$ pdf2txt.py example.pdf
all the text from the pdf appears on the command line

The :ref:`api_pdf2txt` tool extracts all the text from a PDF. It uses layout
@@ -29,7 +29,7 @@ dumppdf.py

::

$ python tools/dumppdf.py -a example.pdf
$ dumppdf.py -a example.pdf
<pdf><object id="1">
...
</object>