Consistent instructions for how to install and use pdfminer.six (#793)
parent ad6587c697
commit 769dbb6343
README.md (15 changes)
@@ -40,7 +40,7 @@ How to use
 ----------

 * Install Python 3.6 or newer.
-* Install
+* Install pdfminer.six.

 `pip install pdfminer.six`

@@ -48,9 +48,18 @@ How to use

 `pip install 'pdfminer.six[image]'`

-* Use command-line interface to extract text from pdf:
+* Use the command-line interface to extract text from pdf.

-`python pdf2txt.py samples/simple1.pdf`
+`pdf2txt.py example.pdf`

+* Or use it with Python.
+
+```python
+from pdfminer.high_level import extract_text
+
+text = extract_text("example.pdf")
+print(text)
+```
+
 Contributing
 ------------
@@ -59,18 +59,31 @@ Features
|
||||||
Installation instructions
|
Installation instructions
|
||||||
=========================
|
=========================
|
||||||
|
|
||||||
Before using it, you must install it using Python 3.6 or newer.
|
* Install Python 3.6 or newer.
|
||||||
|
* Install pdfminer.six.
|
||||||
|
|
||||||
::
|
::
|
||||||
|
$ pip install pdfminer.six`
|
||||||
|
|
||||||
$ pip install pdfminer.six
|
* (Optionally) install extra dependencies for extracting images.
|
||||||
|
|
||||||
|
|
||||||
Optionally install extra dependencies that are needed to extract jpg images.
|
|
||||||
|
|
||||||
::
|
::
|
||||||
|
$ pip install 'pdfminer.six[image]'`
|
||||||
|
|
||||||
|
* Use the command-line interface to extract text from pdf.
|
||||||
|
|
||||||
|
::
|
||||||
|
$ pdf2txt.py example.pdf`
|
||||||
|
|
||||||
|
* Or use it with Python.
|
||||||
|
|
||||||
|
.. code-block:: python
|
||||||
|
|
||||||
|
from pdfminer.high_level import extract_text
|
||||||
|
|
||||||
|
text = extract_text("example.pdf")
|
||||||
|
print(text)
|
||||||
|
|
||||||
$ pip install 'pdfminer.six[image]'
|
|
||||||
|
|
||||||
|
|
||||||
Contributing
|
Contributing
|
||||||
|
|
|
@@ -18,7 +18,7 @@ pdf2txt.py

 ::

-$ python tools/pdf2txt.py example.pdf
+$ pdf2txt.py example.pdf
 all the text from the pdf appears on the command line

 The :ref:`api_pdf2txt` tool extracts all the text from a PDF. It uses layout
@@ -29,7 +29,7 @@ dumppdf.py

 ::

-$ python tools/dumppdf.py -a example.pdf
+$ dumppdf.py -a example.pdf
 <pdf><object id="1">
 ...
 </object>
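
For readers who want to go one step beyond the `extract_text` snippet this commit adds to the README, the following is a minimal sketch of a helper that saves the extracted text to a file. It is not part of the commit or of pdfminer.six: the function name `extract_to_file` and its `extractor` parameter are hypothetical, introduced here only for illustration. The one pdfminer.six call it relies on is `pdfminer.high_level.extract_text`, as shown in the diff above.

```python
from pathlib import Path


def extract_to_file(pdf_path, out_path, extractor=None):
    """Extract text from pdf_path and write it to out_path as UTF-8.

    `extractor` is a hypothetical hook (not a pdfminer.six feature) that
    lets a caller substitute another text-extraction callable. When it is
    None, the function uses pdfminer.six's extract_text.
    """
    if extractor is None:
        # Imported lazily, so pdfminer.six is only required on this path.
        from pdfminer.high_level import extract_text as extractor

    text = extractor(pdf_path)
    Path(out_path).write_text(text, encoding="utf-8")
    # Return the number of characters written, as a quick sanity check.
    return len(text)
```

With pdfminer.six installed, `extract_to_file("example.pdf", "example.txt")` would write the same text that `pdf2txt.py example.pdf` prints to the terminal, assuming default options on both sides.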