pdfminer.six/pdfminer/runlength.py


#
# RunLength decoder (Adobe version) implementation based on PDF Reference
# version 1.4 section 3.3.4.
#
#  * public domain *
#

import six


def rldecode(data):
    """
    RunLength decoder (Adobe version) implementation based on PDF Reference
    version 1.4 section 3.3.4:
        The RunLengthDecode filter decodes data that has been encoded in a
        simple byte-oriented format based on run length. The encoded data
        is a sequence of runs, where each run consists of a length byte
        followed by 1 to 128 bytes of data. If the length byte is in the
        range 0 to 127, the following length + 1 (1 to 128) bytes are
        copied literally during decompression. If length is in the range
        129 to 255, the following single byte is to be copied 257 - length
        (2 to 128) times during decompression. A length value of 128
        denotes EOD.
    """
    decoded = b''
    i = 0
    while i < len(data):
        length = six.indexbytes(data, i)
        if length == 128:
            break
        if length >= 0 and length < 128:
            for j in range(i+1, (i+1)+(length+1)):
                decoded += six.int2byte(six.indexbytes(data, j))
            i = (i+1) + (length+1)
        if length > 128:
            run = six.int2byte(six.indexbytes(data, i+1))*(257-length)
            decoded += run
            i = (i+1) + 1
    return decoded
Removing all the "#!/usr/bin/env python" lines, they do not need for … (#34) * Removing all the "#!/usr/bin/env python" lines, they do not need for python3, solving issue number: #19. * Restored all the shebangs in the tools and tests folders (because they are real executables) but used "#!/usr/bin/env python" instead of "#!/usr/bin/python" as this blog points out: https://www.peterbe.com/plog/importance-of-env Removed also the shebang from pdfminer/psparser.py file. 2016-11-08 19:01:11 +00:00
Added RunLengthDecode filter by Troy Bollinger. git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@167 1aa58f4a-7d42-0410-adbc-911cccaed67c 2009-12-24 11:51:43 +00:00			`#`
			`# RunLength decoder (Adobe version) implementation based on PDF Reference`
			`# version 1.4 section 3.3.4.`
			`#`
			`# * public domain *`
			`#`

Enforce pep8 coding-style (#345) * Code Refractor: Use code-style enforcement #312 * Add flake8 to travis-ci * Remove python 2 3 comment on six library. 891 errors > 870 errors. * Remove class and functions comments that consist of just the name. 870 errors > 855 errors. * Fix flake8 errors in pdftypes.py. 855 errors > 833 errors. * Moving flake8 testing from .travis.yml to tox.ini to ensure local testing before commiting * Cleanup pdfinterp.py and add documentation from PDF Reference * Cleanup pdfpage.py * Cleanup pdffont.py * Clean psparser.py * Cleanup high_level.py * Cleanup layout.py * Cleanup pdfparser.py * Cleanup pdfcolor.py * Cleanup rijndael.py * Cleanup converter.py * Rename klass to cls if it is the class variable, to be more consistent with standard practice * Cleanup cmap.py * Cleanup pdfdevice.py * flake8 ignore fontmetrics.py * Cleanup test_pdfminer_psparser.py * Fix flake8 in pdfdocument.py; 339 errors to go * Fix flake8 utils.py; 326 errors togo * pep8 correction for few files in /tools/ 328 > 160 to go (#342) * pep8 correction for few files in /tools/ 328 > 160 to go * pep8 correction: 160 > 5 to go * Fix ascii85.py errors * Fix error in getting index from target that does not exists * Remove commented print lines * Fix flake8 error in pdfinterp.py * Fix python2 specific error by removing argument from print statement * Ignore invalid python2 syntax * Update contributing.md * Added changelog * Remove unused import Co-authored-by: Fakabbir Amin <f4amin@gmail.com> 2019-12-29 20:20:20 +00:00			`import six`

Python 3.4 compatibility + tests 2014-09-04 07:36:19 +00:00
Added RunLengthDecode filter by Troy Bollinger. git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@167 1aa58f4a-7d42-0410-adbc-911cccaed67c 2009-12-24 11:51:43 +00:00			`def rldecode(data):`
			`"""`
			`RunLength decoder (Adobe version) implementation based on PDF Reference`
			`version 1.4 section 3.3.4:`
			`The RunLengthDecode filter decodes data that has been encoded in a`
			`simple byte-oriented format based on run length. The encoded data`
			`is a sequence of runs, where each run consists of a length byte`
			`followed by 1 to 128 bytes of data. If the length byte is in the`
			`range 0 to 127, the following length + 1 (1 to 128) bytes are`
			`copied literally during decompression. If length is in the range`
			`129 to 255, the following single byte is to be copied 257 - length`
			`(2 to 128) times during decompression. A length value of 128`
			`denotes EOD.`
			`"""`
Python 3.4 compatibility + tests 2014-09-04 07:36:19 +00:00			`decoded = b''`
PEP8: Whitespace changes to match pep8 2013-11-07 08:35:04 +00:00			`i = 0`
Added RunLengthDecode filter by Troy Bollinger. git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@167 1aa58f4a-7d42-0410-adbc-911cccaed67c 2009-12-24 11:51:43 +00:00			`while i < len(data):`
Enforce pep8 coding-style (#345) * Code Refractor: Use code-style enforcement #312 * Add flake8 to travis-ci * Remove python 2 3 comment on six library. 891 errors > 870 errors. * Remove class and functions comments that consist of just the name. 870 errors > 855 errors. * Fix flake8 errors in pdftypes.py. 855 errors > 833 errors. * Moving flake8 testing from .travis.yml to tox.ini to ensure local testing before commiting * Cleanup pdfinterp.py and add documentation from PDF Reference * Cleanup pdfpage.py * Cleanup pdffont.py * Clean psparser.py * Cleanup high_level.py * Cleanup layout.py * Cleanup pdfparser.py * Cleanup pdfcolor.py * Cleanup rijndael.py * Cleanup converter.py * Rename klass to cls if it is the class variable, to be more consistent with standard practice * Cleanup cmap.py * Cleanup pdfdevice.py * flake8 ignore fontmetrics.py * Cleanup test_pdfminer_psparser.py * Fix flake8 in pdfdocument.py; 339 errors to go * Fix flake8 utils.py; 326 errors togo * pep8 correction for few files in /tools/ 328 > 160 to go (#342) * pep8 correction for few files in /tools/ 328 > 160 to go * pep8 correction: 160 > 5 to go * Fix ascii85.py errors * Fix error in getting index from target that does not exists * Remove commented print lines * Fix flake8 error in pdfinterp.py * Fix python2 specific error by removing argument from print statement * Ignore invalid python2 syntax * Update contributing.md * Added changelog * Remove unused import Co-authored-by: Fakabbir Amin <f4amin@gmail.com> 2019-12-29 20:20:20 +00:00			`length = six.indexbytes(data, i)`
Added RunLengthDecode filter by Troy Bollinger. git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@167 1aa58f4a-7d42-0410-adbc-911cccaed67c 2009-12-24 11:51:43 +00:00			`if length == 128:`
			`break`
			`if length >= 0 and length < 128:`
Enforce pep8 coding-style (#345) * Code Refractor: Use code-style enforcement #312 * Add flake8 to travis-ci * Remove python 2 3 comment on six library. 891 errors > 870 errors. * Remove class and functions comments that consist of just the name. 870 errors > 855 errors. * Fix flake8 errors in pdftypes.py. 855 errors > 833 errors. * Moving flake8 testing from .travis.yml to tox.ini to ensure local testing before commiting * Cleanup pdfinterp.py and add documentation from PDF Reference * Cleanup pdfpage.py * Cleanup pdffont.py * Clean psparser.py * Cleanup high_level.py * Cleanup layout.py * Cleanup pdfparser.py * Cleanup pdfcolor.py * Cleanup rijndael.py * Cleanup converter.py * Rename klass to cls if it is the class variable, to be more consistent with standard practice * Cleanup cmap.py * Cleanup pdfdevice.py * flake8 ignore fontmetrics.py * Cleanup test_pdfminer_psparser.py * Fix flake8 in pdfdocument.py; 339 errors to go * Fix flake8 utils.py; 326 errors togo * pep8 correction for few files in /tools/ 328 > 160 to go (#342) * pep8 correction for few files in /tools/ 328 > 160 to go * pep8 correction: 160 > 5 to go * Fix ascii85.py errors * Fix error in getting index from target that does not exists * Remove commented print lines * Fix flake8 error in pdfinterp.py * Fix python2 specific error by removing argument from print statement * Ignore invalid python2 syntax * Update contributing.md * Added changelog * Remove unused import Co-authored-by: Fakabbir Amin <f4amin@gmail.com> 2019-12-29 20:20:20 +00:00			`for j in range(i+1, (i+1)+(length+1)):`
			`decoded += six.int2byte(six.indexbytes(data, j))`
Added RunLengthDecode filter by Troy Bollinger. git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@167 1aa58f4a-7d42-0410-adbc-911cccaed67c 2009-12-24 11:51:43 +00:00			`i = (i+1) + (length+1)`
			`if length > 128:`
Enforce pep8 coding-style (#345) * Code Refractor: Use code-style enforcement #312 * Add flake8 to travis-ci * Remove python 2 3 comment on six library. 891 errors > 870 errors. * Remove class and functions comments that consist of just the name. 870 errors > 855 errors. * Fix flake8 errors in pdftypes.py. 855 errors > 833 errors. * Moving flake8 testing from .travis.yml to tox.ini to ensure local testing before commiting * Cleanup pdfinterp.py and add documentation from PDF Reference * Cleanup pdfpage.py * Cleanup pdffont.py * Clean psparser.py * Cleanup high_level.py * Cleanup layout.py * Cleanup pdfparser.py * Cleanup pdfcolor.py * Cleanup rijndael.py * Cleanup converter.py * Rename klass to cls if it is the class variable, to be more consistent with standard practice * Cleanup cmap.py * Cleanup pdfdevice.py * flake8 ignore fontmetrics.py * Cleanup test_pdfminer_psparser.py * Fix flake8 in pdfdocument.py; 339 errors to go * Fix flake8 utils.py; 326 errors togo * pep8 correction for few files in /tools/ 328 > 160 to go (#342) * pep8 correction for few files in /tools/ 328 > 160 to go * pep8 correction: 160 > 5 to go * Fix ascii85.py errors * Fix error in getting index from target that does not exists * Remove commented print lines * Fix flake8 error in pdfinterp.py * Fix python2 specific error by removing argument from print statement * Ignore invalid python2 syntax * Update contributing.md * Added changelog * Remove unused import Co-authored-by: Fakabbir Amin <f4amin@gmail.com> 2019-12-29 20:20:20 +00:00			`run = six.int2byte(six.indexbytes(data, i+1))*(257-length)`
			`decoded += run`
Added RunLengthDecode filter by Troy Bollinger. git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@167 1aa58f4a-7d42-0410-adbc-911cccaed67c 2009-12-24 11:51:43 +00:00			`i = (i+1) + 1`
Python 3.4 compatibility + tests 2014-09-04 07:36:19 +00:00			`return decoded`