peepdf crashes with a TypeError when certain PDFs are analyzed in force parsing mode and PDFObjectStream.resolveReferences() is invoked:
Traceback (most recent call last):
  File "/home/sdeiss/Developer/bin/virtualenv/peekaboo/local/lib/python2.7/site-packages/peepdf/main.py", line 409, in main
    ret, pdf = pdfParser.parse(fileName, options.isForceMode, options.isLooseMode, options.isManualAnalysis)
  File "/home/sdeiss/Developer/bin/virtualenv/peekaboo/local/lib/python2.7/site-packages/peepdf/PDFCore.py", line 7098, in parse
    ret = body.updateObjects()
  File "/home/sdeiss/Developer/bin/virtualenv/peekaboo/local/lib/python2.7/site-packages/peepdf/PDFCore.py", line 4288, in updateObjects
    object.resolveReferences()
  File "/home/sdeiss/Developer/bin/virtualenv/peekaboo/local/lib/python2.7/site-packages/peepdf/PDFCore.py", line 3253, in resolveReferences
    ret = PDFParser.readObject(objectsSection[offset:])
TypeError: slice indices must be integers or None or have an __index__ method
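For reference, the first error can be reproduced outside peepdf. The snippet below is only a minimal sketch: the traceback does not show what type offset actually has, so a string is assumed here, and objects_section and slice_from are hypothetical names, not peepdf code.

```python
# Hypothetical stand-in for the decoded contents of an object stream.
objects_section = "1 0 2 5 <</A 1>> [1 2]"


def slice_from(data, offset):
    """Return data[offset:], coercing offset to int first."""
    return data[int(offset):]


# Assumption: the offset was read from the stream as text, not as an int.
offset = "8"

try:
    objects_section[offset:]
except TypeError as exc:
    # Same message as in the traceback above.
    error_message = str(exc)

# Converting the offset to int makes the slice valid.
fixed = slice_from(objects_section, offset)
```

Note that int() itself raises ValueError on non-numeric text, so a real fix in force mode would probably also need to handle malformed offsets.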
If I fix that TypeError by converting offset to an int at PDFCore.py:3243, I get another one:
Traceback (most recent call last):
  File "/home/sdeiss/Developer/bin/virtualenv/peekaboo/local/lib/python2.7/site-packages/peepdf/main.py", line 409, in main
    ret, pdf = pdfParser.parse(fileName, options.isForceMode, options.isLooseMode, options.isManualAnalysis)
  File "/home/sdeiss/Developer/bin/virtualenv/peekaboo/local/lib/python2.7/site-packages/peepdf/PDFCore.py", line 7098, in parse
    ret = body.updateObjects()
  File "/home/sdeiss/Developer/bin/virtualenv/peekaboo/local/lib/python2.7/site-packages/peepdf/PDFCore.py", line 4288, in updateObjects
    object.resolveReferences()
  File "/home/sdeiss/Developer/bin/virtualenv/peekaboo/local/lib/python2.7/site-packages/peepdf/PDFCore.py", line 3253, in resolveReferences
    ret = PDFParser.readObject(objectsSection[offset:])
TypeError: unbound method readObject() must be called with PDFParser instance as first argument (got str instance instead)
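The second error occurs because readObject() is an instance method but is called on the PDFParser class itself, so the sliced string ends up in the place of self. A possible solution would be to pass the PDFParser object to PDFObjectStream when that instance is created and then call readObject() on the stored PDFParser instance.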
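The proposed fix could look roughly like the sketch below. Parser and ObjectStream are hypothetical stand-ins for PDFParser and PDFObjectStream; their constructors and method bodies are assumptions for illustration and do not reflect peepdf's real signatures.

```python
class Parser(object):
    """Hypothetical stand-in for PDFParser; not peepdf's real class."""

    def read_object(self, raw):
        # Trivial placeholder for the real parsing logic.
        return raw.strip()


class ObjectStream(object):
    """Hypothetical stand-in for PDFObjectStream."""

    def __init__(self, objects_section, parser):
        self.objects_section = objects_section
        self.parser = parser  # parser instance supplied at construction

    def resolve_references(self, offset):
        # Call the method on the stored instance, not on the class.
        return self.parser.read_object(self.objects_section[int(offset):])


# Calling through the class fails because the string takes the place of self.
try:
    Parser.read_object("  << /A 1 >>")
    class_call_failed = False
except TypeError:
    class_call_failed = True

stream = ObjectStream("1 0  << /A 1 >> ", Parser())
resolved = stream.resolve_references("3")
```

Python 2 rejects the class-level call with the "unbound method" message shown above, while Python 3 reports a missing positional argument instead; both are TypeErrors. If readObject() does not depend on any parser state, declaring it a static method might be an alternative, but that depends on how it is implemented.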