From d04c38fb8d55f52413609b8d19dfe2aed4d3f65e Mon Sep 17 00:00:00 2001
From: Pieter Marsman
Date: Sun, 11 Oct 2020 20:04:57 +0200
Subject: [PATCH 1/8] Add punchline to readme

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index c78565a..101bec0 100644
--- a/README.md
+++ b/README.md
@@ -5,6 +5,8 @@ pdfminer.six
 [![PyPI version](https://img.shields.io/pypi/v/pdfminer.six.svg)](https://pypi.python.org/pypi/pdfminer.six/)
 [![gitter](https://badges.gitter.im/pdfminer-six/Lobby.svg)](https://gitter.im/pdfminer-six/Lobby?utm_source=badge&utm_medium)
 
+*We fathom PDF*
+
 Pdfminer.six is a community maintained fork of the original PDFMiner. It is a tool for extracting information from PDF documents. It focuses on getting and analyzing text data. Pdfminer.six extracts the text from a page directly

From bbc01f749a3ea649c5d28e719b51f66ec711b88a Mon Sep 17 00:00:00 2001
From: Pieter Marsman
Date: Sun, 11 Oct 2020 20:05:11 +0200
Subject: [PATCH 2/8] Add punchline to docs

---
 docs/source/index.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/source/index.rst b/docs/source/index.rst
index dd06e2d..8030683 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -13,6 +13,7 @@ Welcome to pdfminer.six's documentation!
     :target: https://gitter.im/pdfminer-six/Lobby?utm_source=badge&utm_medium
     :alt: gitter badge
 
+We fathom PDF.
 Pdfminer.six is a python package for extracting information from PDF documents.
From 14cc66ae6d2da88695aa10bc52925231ff74ae19 Mon Sep 17 00:00:00 2001
From: Pieter Marsman
Date: Sun, 11 Oct 2020 20:05:26 +0200
Subject: [PATCH 3/8] Add frequently asked questions

---
 docs/source/faq.rst                 | 43 +++++++++++++++++++++++++++++
 docs/source/index.rst               |  1 +
 docs/source/reference/highlevel.rst |  4 ++-
 3 files changed, 47 insertions(+), 1 deletion(-)
 create mode 100644 docs/source/faq.rst

diff --git a/docs/source/faq.rst b/docs/source/faq.rst
new file mode 100644
index 0000000..115593f
--- /dev/null
+++ b/docs/source/faq.rst
@@ -0,0 +1,43 @@
+.. _fac:
+
+Frequently asked questions
+**************************
+
+Why is it called pdfminer.six?
+==============================
+
+Pdfminer.six is a for of the `original pdfminer created by Euske
+`_. Almost all of the code and architecture is in
+fact created by Euske. But, for a long time this original pdfminer did not
+support Python 3. Untill 2020 the original pdfminer only supported Python 2.
+
+Pdfminer.six started as a for of the original pdfminer with the goal of adding
+support for Python 3. This was done with the six package. The six package helps
+to write code that is compatible with both Python 2 and Python 3. Hence,
+pdfminer.six.
+
+As of 2020, pdfminer.six dropped the support for Python 2 because it was
+`end-of-live `_. While the .six
+part is no longer applicable, we kept the name to prevent breaking changes for
+existing users.
+
+The current punchline "We fathom PDF" is a `whimsical reference
+`_
+to the six. Fathom means both deeply understanding something, and a fathom is
+also equal to six feet.
+
+How does pdfminer.six compare to other forks of pdfminer?
+==========================================================
+
+Pdfminer.six is now an independent and community maintained package for
+extracting text from PDF's with Python. We actively fix bugs (also for PDF's
+that don't strictly follow the PDF Reference), add new features and improve
+the usability of pdfminer.six. This community separates pdfminer.six from the
+other forks of the original pdfminer. PDF as a format is very diverse and
+there are countless deviations from the official format. The only way to
+support all the PDF's out there is to have a community that actively uses and
+improves pdfminer.
+
+Since 2020, the original pdfminer is `dormant
+`_, and pdfminer.six is the
+recommended by Euske if you need an actively maintained version of pdfminer.
\ No newline at end of file
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 8030683..d73fc04 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -39,6 +39,7 @@ pdfminer.six.
     howto/index
     topic/index
     reference/index
+    faq
 
 
 Features
diff --git a/docs/source/reference/highlevel.rst b/docs/source/reference/highlevel.rst
index 9d98ba6..b764e90 100644
--- a/docs/source/reference/highlevel.rst
+++ b/docs/source/reference/highlevel.rst
@@ -25,4 +25,6 @@ extract_pages
 =============
 
 .. currentmodule:: pdfminer.high_level
-.. autofunction:: extract_pages
\ No newline at end of file
+.. autofunction:: extract_pages
+
+.. _api_extract_pages:
\ No newline at end of file

From 4be9757b86368e13a11a36c5dde4e831f5a58625 Mon Sep 17 00:00:00 2001
From: Pieter Marsman
Date: Mon, 12 Oct 2020 09:20:30 +0200
Subject: [PATCH 4/8] Update docs/source/faq.rst

Co-authored-by: Jake Stockwin
---
 docs/source/faq.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/source/faq.rst b/docs/source/faq.rst
index 115593f..1557432 100644
--- a/docs/source/faq.rst
+++ b/docs/source/faq.rst
@@ -1,4 +1,4 @@
-.. _fac:
+.. _faq:
 
 Frequently asked questions
 **************************
@@ -40,4 +40,4 @@ improves pdfminer.
 
 Since 2020, the original pdfminer is `dormant
 `_, and pdfminer.six is the
-recommended by Euske if you need an actively maintained version of pdfminer.
\ No newline at end of file
+recommended by Euske if you need an actively maintained version of pdfminer.
From a805653a833b2f49d00538e59d49e8821149e12b Mon Sep 17 00:00:00 2001
From: Pieter Marsman
Date: Mon, 12 Oct 2020 09:20:37 +0200
Subject: [PATCH 5/8] Update docs/source/faq.rst

Co-authored-by: Jake Stockwin
---
 docs/source/faq.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/faq.rst b/docs/source/faq.rst
index 1557432..516aa80 100644
--- a/docs/source/faq.rst
+++ b/docs/source/faq.rst
@@ -11,7 +11,7 @@ Pdfminer.six is a for of the `original pdfminer created by Euske
 fact created by Euske. But, for a long time this original pdfminer did not
 support Python 3. Untill 2020 the original pdfminer only supported Python 2.
 
-Pdfminer.six started as a for of the original pdfminer with the goal of adding
+Pdfminer.six started as a fork of the original pdfminer with the goal of adding
 support for Python 3. This was done with the six package. The six package helps
 to write code that is compatible with both Python 2 and Python 3. Hence,
 pdfminer.six.

From e59b1bca2f764de8e023b49c70d64e80c1bd6b78 Mon Sep 17 00:00:00 2001
From: Pieter Marsman
Date: Mon, 12 Oct 2020 09:20:43 +0200
Subject: [PATCH 6/8] Update docs/source/faq.rst

Co-authored-by: Jake Stockwin
---
 docs/source/faq.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/faq.rst b/docs/source/faq.rst
index 516aa80..f74af36 100644
--- a/docs/source/faq.rst
+++ b/docs/source/faq.rst
@@ -17,7 +17,7 @@ to write code that is compatible with both Python 2 and Python 3. Hence,
 pdfminer.six.
 
 As of 2020, pdfminer.six dropped the support for Python 2 because it was
-`end-of-live `_. While the .six
+`end-of-life `_. While the .six
 part is no longer applicable, we kept the name to prevent breaking changes for
 existing users.
From 599f0391b5f0f75cd72adf61a5d9db74045ba828 Mon Sep 17 00:00:00 2001
From: Pieter Marsman
Date: Mon, 12 Oct 2020 09:22:41 +0200
Subject: [PATCH 7/8] Update faq.rst

---
 docs/source/faq.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/source/faq.rst b/docs/source/faq.rst
index f74af36..546205e 100644
--- a/docs/source/faq.rst
+++ b/docs/source/faq.rst
@@ -39,5 +39,5 @@ support all the PDF's out there is to have a community that actively uses and
 improves pdfminer.
 
 Since 2020, the original pdfminer is `dormant
-`_, and pdfminer.six is the
-recommended by Euske if you need an actively maintained version of pdfminer.
+`_, and pdfminer.six is the fork
+which Euske recommends if you need an actively maintained version of pdfminer.

From c66eca3c2933222870056366e8f3b506fccae553 Mon Sep 17 00:00:00 2001
From: Pieter Marsman
Date: Sun, 18 Oct 2020 12:49:54 +0200
Subject: [PATCH 8/8] Update faq.rst

---
 docs/source/faq.rst | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/docs/source/faq.rst b/docs/source/faq.rst
index 546205e..5a742d6 100644
--- a/docs/source/faq.rst
+++ b/docs/source/faq.rst
@@ -6,15 +6,13 @@ Frequently asked questions
 Why is it called pdfminer.six?
 ==============================
 
-Pdfminer.six is a for of the `original pdfminer created by Euske
+Pdfminer.six is a fork of the `original pdfminer created by Euske
 `_. Almost all of the code and architecture is in
 fact created by Euske. But, for a long time this original pdfminer did not
-support Python 3. Untill 2020 the original pdfminer only supported Python 2.
-
-Pdfminer.six started as a fork of the original pdfminer with the goal of adding
-support for Python 3. This was done with the six package. The six package helps
-to write code that is compatible with both Python 2 and Python 3. Hence,
-pdfminer.six.
+support Python 3. Until 2020 the original pdfminer only supported Python 2.
+The original goal of pdfminer.six was to add support for Python 3. This was
+done with the six package. The six package helps to write code that is
+compatible with both Python 2 and Python 3. Hence, pdfminer.six.
 
 As of 2020, pdfminer.six dropped the support for Python 2 because it was
 `end-of-life `_. While the .six