From 7e91d4ec6d56cd9e5d130e2fceb38d2b335c3694 Mon Sep 17 00:00:00 2001
From: Pieter Marsman
Date: Sun, 8 Mar 2020 14:53:16 +0100
Subject: [PATCH] Improve docs and github templates

---
 .github/ISSUE_TEMPLATE/bug_report.md      | 18 +++++++-------
 .github/ISSUE_TEMPLATE/feature_request.md | 15 ++++++------
 .github/pull_request_template.md          | 29 ++++++++++++++++-------
 CONTRIBUTING.md                           |  1 +
 README.md                                 | 23 ++++++++----------
 docs/source/api/highlevel.rst             |  7 ++++++
 6 files changed, 56 insertions(+), 37 deletions(-)

diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md
index 2fa183f..33c91d1 100644
--- a/.github/ISSUE_TEMPLATE/bug_report.md
+++ b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -1,20 +1,20 @@
 ---
 name: Bug report
-about: Create a report to help us improve
+about: Report a bug
 title: ''
 labels: bug
 assignees: ''
 
 ---
 
-**Describe the bug**
-A clear and concise description of what the bug is.
+**Bug report**
 
-**To Reproduce**
+Thanks for finding the bug! To help us fix it, please make sure that you
+include the following information:
 
-1. If any, include the code that you are using
-2. If any, include the command line statements that you are using
-3. If you have problems with a specific pdf file, include that pdf file
+- A description of the bug
+- Steps to reproduce the bug. Try to minimize the number of steps needed.
+  Include the command and/or script that you use. Also include the PDF that
+  you use.
+- If relevant, include the output and/or error stacktrace.
 
-**Expected behavior**
-A clear and concise description of what you expected to happen.
diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md
index 110953e..296a3c6 100644
--- a/.github/ISSUE_TEMPLATE/feature_request.md
+++ b/.github/ISSUE_TEMPLATE/feature_request.md
@@ -1,17 +1,18 @@
 ---
 name: Feature request
-about: Suggest an improvement for this project
+about: Request a new feature
 title: ''
 labels: enhancement
 assignees: ''
 
 ---
 
-**Is your feature request related to a problem? Please describe.**
-A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+**Feature request**
 
-**Describe the solution you'd like**
-A clear and concise description of what you want to happen.
+Thanks for your suggestion on improving pdfminer.six. To help us discuss and
+implement this request, please make sure to include the following information:
 
-**Describe alternatives you've considered**
-A clear and concise description of any alternative solutions or features you've considered.
+- A description of the feature you would like to have
+- If relevant, the context that you are in. What are you trying to achieve?
+- If possible, an example of what you want to achieve. Include the PDF that
+  you are working on. Include the output that you would like to have.
diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
index 8cfa7f3..ed037a3 100644
--- a/.github/pull_request_template.md
+++ b/.github/pull_request_template.md
@@ -1,17 +1,30 @@
-**Description**
+**Pull request**
 
-Please include a summary of the change and which issue is fixed. If this does not fix an issue, then first create a new issue. Please also include relevant motivation and context.
+Thanks for improving pdfminer.six! Please include the following information to
+help us discuss and merge this PR:
 
-Fixes # (issue)
+- A description of why this PR is needed. What does it fix? What does it
+  improve?
+- A summary of the things that this PR changes.
+- Reference the issues that this PR fixes (use the fixes #(issue nr) syntax).
+  If this PR does not fix any issue, create the issue first and mention that
+  you are willing to work on it.
 
 **How Has This Been Tested?**
 
-Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Include an example pdf if you have one.
+Please describe the tests that you ran to verify your changes. Provide
+instructions so we can reproduce. Include an example pdf if you have one.
 
 **Checklist**
 
-- [ ] I have added tests that prove my fix is effective or that my feature works
-- [ ] I have updated the [README.md](../README.md) and other documentation, or I am sure that this is not necessary
-- [ ] I have added a consice human-readable description of the change to [CHANGELOG.md](../CHANGELOG.md)
+- [ ] I have added tests that prove my fix is effective or that my feature
+  works
 - [ ] I have added docstrings to newly created methods and classes
-- [ ] I have optimized the code at least one time after creating the initial version
+- [ ] I have optimized the code at least one time after creating the initial
+  version
+- [ ] I have updated the [README.md](../README.md) or I have verified that
+  this is not necessary
+- [ ] I have updated the [readthedocs](../docs/source) documentation or I
+  have verified that this is not necessary
+- [ ] I have added a concise human-readable description of the change to
+  [CHANGELOG.md](../CHANGELOG.md)
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 071a306..79e3dc6 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -15,6 +15,7 @@ Any contribution is appreciated! You might want to:
   issue.
 * Fix issues by [creating pull requests](https://help.github.com/en/articles/creating-a-pull-request).
 * Help others by sharing your thoughs in comments on issues and pull requests.
+* Join the chat on [gitter](https://gitter.im/pdfminer-six/Lobby)
 
 ## Guidelines for creating issues
 
diff --git a/README.md b/README.md
index b5f8675..234e245 100644
--- a/README.md
+++ b/README.md
@@ -5,15 +5,15 @@ pdfminer.six
 [![PyPI version](https://img.shields.io/pypi/v/pdfminer.six.svg)](https://pypi.python.org/pypi/pdfminer.six/)
 [![gitter](https://badges.gitter.im/pdfminer-six/Lobby.svg)](https://gitter.im/pdfminer-six/Lobby?utm_source=badge&utm_medium)
 
-Pdfminer.six is an community maintained fork of the original PDFMiner. It is a
-tool for extracting information from PDF documents.
-Unlike other PDF-related tools, it focuses entirely on getting
-and analyzing text data. Pdfminer.six allows one to obtain
-the exact location of text in a page, as well as
-other information such as fonts or lines.
-It includes a PDF converter that can transform PDF files
-into other text formats (such as HTML). It has an extensible
-PDF parser that can be used for other purposes than text analysis.
+Pdfminer.six is a community-maintained fork of the original PDFMiner. It is a
+tool for extracting information from PDF documents. It focuses on getting
+and analyzing text data. Pdfminer.six extracts the text from a page directly
+from the source code of the PDF. It can also be used to get the exact location,
+font or color of the text.
+
+It is built in a modular way such that each component of pdfminer.six can be
+replaced easily. You can implement your own interpreter or rendering device
+to use the power of pdfminer.six for purposes other than text analysis.
 
 Check out the full documentation on
 [Read the Docs](https://pdfminersix.readthedocs.io).
@@ -29,7 +29,7 @@ Features
  * Various font types (Type1, TrueType, Type3, and CID) support.
  * Support for extracting images (JPG, JBIG2 and Bitmaps).
  * Support for RC4 and AES encryption.
- * Outline (TOC) extraction.
+ * Table of contents extraction.
  * Tagged contents extraction.
  * Automatic layout analysis.
 
@@ -45,9 +45,6 @@ How to use
 * Use command-line interface to extract text from pdf:
 
   `python pdf2txt.py samples/simple1.pdf`
-
-* Check out more examples and documentation on
-[Read the Docs](https://pdfminersix.readthedocs.io).
 
 
 Contributing
diff --git a/docs/source/api/highlevel.rst b/docs/source/api/highlevel.rst
index 4f34b46..9d98ba6 100644
--- a/docs/source/api/highlevel.rst
+++ b/docs/source/api/highlevel.rst
@@ -19,3 +19,10 @@ extract_text_to_fp
 
 .. currentmodule:: pdfminer.high_level
 .. autofunction:: extract_text_to_fp
+
+
+extract_pages
+=============
+
+.. currentmodule:: pdfminer.high_level
+.. autofunction:: extract_pages
\ No newline at end of file