pdfminer.six/samples/contrib
Andrew Baumann 1d1602e0c5
Added feature: page labels (#680)
* port page label code from pdfannots

* add tests and clean up

* more cleanup; harden against non-conforming input

* one more test

* update CHANGELOG

* cleanup & respond to review feedback (incomplete)

* Refactor implementation of get_page_labels() into a NumberTree and PageLabels class.

* PageLabels *is* a NumberTree and should always behave like one. This justifies inheriting its data and behavior. And it simplifies the code a bit more.

* fix type errors and cleanup slightly

 * fix mypy errors (including tweaking code to avoid problematic dynamic types)
 * hoist dict_value from NumberTree (where it may not be a dict) to PageLabels (where it must be)
 * avoid repeated warnings by calling _parse() recursively, and checking sortedness only at the end
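
The last bullet (recurse first, check sortedness once at the end) can be illustrated with a minimal sketch. This is not the code from the commit: the function names `flatten_number_tree` and `sorted_pairs` are hypothetical, and the tree is assumed to be already resolved into plain Python dicts that use the PDF number-tree keys "Nums" and "Kids".

```python
import warnings
from typing import Any, Dict, List, Tuple


def flatten_number_tree(node: Dict[str, Any]) -> List[Tuple[int, Any]]:
    """Recursively collect (key, value) pairs from a number-tree-like dict.

    Hypothetical helper; assumes indirect references are already resolved.
    """
    pairs: List[Tuple[int, Any]] = []
    nums = node.get("Nums", [])
    # "Nums" is a flat array: key1, value1, key2, value2, ...
    # A dangling trailing key (odd length) is silently skipped.
    for i in range(0, len(nums) - 1, 2):
        pairs.append((nums[i], nums[i + 1]))
    for kid in node.get("Kids", []):
        pairs.extend(flatten_number_tree(kid))
    return pairs


def sorted_pairs(root: Dict[str, Any]) -> List[Tuple[int, Any]]:
    """Flatten the whole tree, then check ordering exactly once."""
    pairs = flatten_number_tree(root)
    keys = [key for key, _ in pairs]
    if keys != sorted(keys):
        # One warning per tree, rather than one per out-of-order node.
        warnings.warn("number tree keys are out of order; sorting them")
        pairs.sort(key=lambda pair: pair[0])
    return pairs
```

Checking order only on the flattened result means a non-conforming file produces at most one warning, however many intermediate nodes it has.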

Co-authored-by: Pieter Marsman <pietermarsman@gmail.com>
2022-02-01 10:08:05 +01:00
2b.pdf issue #56 reproduced, solution attempt unsuccessful 2017-04-19 14:19:14 +02:00
XIPLAYER0.jb2 Fixes jbig2 writer to write valid jb2 files 2022-01-23 21:41:08 +01:00
issue-00352-asw-oct96-p41.pdf Change Text extraction is not allowed error to warning (#453) 2020-07-11 16:04:11 +02:00
issue-00352-hash-twos-complement.pdf Pack the /P (ermissions) entry from the /Encrypt dictionary in the file trailer, as unsigned long (#352) 2020-01-07 21:59:13 +01:00
issue-00369-excel.pdf Fix converting path to multiple rectangles (#371) 2020-07-11 17:34:38 +02:00
issue-625-identity-cmap.pdf Add support identity unicode cmap (#626) 2021-10-13 21:52:00 +02:00
issue_566_test_1.pdf Fix extraction of some cjk characters (#593) 2021-08-26 21:05:03 +02:00
issue_566_test_2.pdf Fix extraction of some cjk characters (#593) 2021-08-26 21:05:03 +02:00
matplotlib.pdf Added: tests for extracting tests from pdfs with Type3 fonts (#205) 2019-10-22 18:15:59 +02:00
pagelabels.pdf Added feature: page labels (#680) 2022-02-01 10:08:05 +01:00
pdf-with-jbig2.pdf Added: extraction of JBIG2 encoded images (#311) 2019-10-22 17:37:06 +02:00
pr-00530-ml-lines.pdf Fix .paint_path handling of single line segments (#530) 2021-07-27 18:27:32 +02:00