pdfminer.six/docs/source/faq.rst

42 lines
2.0 KiB
ReStructuredText
Raw Normal View History

.. _faq:
2020-10-11 18:05:26 +00:00
Frequently asked questions
**************************
Why is it called pdfminer.six?
==============================
2020-10-18 10:49:54 +00:00
Pdfminer.six is a fork of the `original pdfminer created by Euske
<https://github.com/euske>`_. Almost all of the code and architecture are in
-fact created by Euske. But, for a long time, this original pdfminer did not
2020-10-18 10:49:54 +00:00
support Python 3. Until 2020 the original pdfminer only supported Python 2.
The original goal of pdfminer.six was to add support for Python 3. This was
done with the `six` package. The `six` package helps to write code that is
2020-10-18 10:49:54 +00:00
compatible with both Python 2 and Python 3. Hence, pdfminer.six.
2020-10-11 18:05:26 +00:00
As of 2020, pdfminer.six dropped the support for Python 2 because it was
`end-of-life <https://www.python.org/doc/sunset-python-2/>`_. While the .six
2020-10-11 18:05:26 +00:00
part is no longer applicable, we kept the name to prevent breaking changes for
existing users.
The current punchline "We fathom PDF" is a `whimsical reference
<https://github.com/pdfminer/pdfminer.six/issues/197#issuecomment-655091942>`_
to the six. Fathom means both deeply understanding something, and a fathom is
also equal to six feet.
How does pdfminer.six compare to other forks of pdfminer?
==========================================================
Pdfminer.six is now an independent and community-maintained package for
extracting text from PDFs with Python. We actively fix bugs (also for PDFs
2020-10-11 18:05:26 +00:00
that don't strictly follow the PDF Reference), add new features and improve
the usability of pdfminer.six. This community separates pdfminer.six from the
other forks of the original pdfminer. PDF as a format is very diverse and
there are countless deviations from the official format. The only way to
support all the PDFs out there is to have a community that actively uses and
2020-10-11 18:05:26 +00:00
improves pdfminer.
Since 2020, the original pdfminer is `dormant
2020-10-12 07:22:41 +00:00
<https://github.com/euske/pdfminer#pdfminer>`_, and pdfminer.six is the fork
which Euske recommends if you need an actively maintained version of pdfminer.