NaviiAlto -

pdf-parser.py

2019. 1. 3. 14:10

This tool parses a PDF document to identify the fundamental elements used in the analyzed file. It does not render the PDF document. The parser's code is quick-and-dirty; I'm not recommending it as a textbook case for PDF parsers, but it gets the job done.

You can see the parser in action in this screencast.

The stats option displays statistics of the objects found in the PDF document. Use it to identify PDF documents with unusual or unexpected objects, or to classify PDF documents. For example, I generated statistics for 2 malicious PDF files, and although they were very different in content and size, the statistics were identical, which strongly suggests that they used the same attack vector and shared the same origin.
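To make the idea of per-document object statistics concrete, here is a minimal sketch that counts indirect objects by their /Type value. It is not pdf-parser's own code: the sample bytes, the regular expressions, and the function name type_statistics are all assumptions made for illustration, and a regex-based scan like this will miss many real-world PDF constructs.

```python
import re
from collections import Counter

# Hypothetical, hand-written sample; a real PDF would be read from disk as bytes.
SAMPLE = (
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n"
    b"3 0 obj\n<< /Type /Page /Parent 2 0 R >>\nendobj\n"
    b"4 0 obj\n<< /Length 0 >>\nendobj\n"
)

# Simplified patterns (assumptions): "<id> <version> obj ... endobj" and "/Type /Name".
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)
TYPE_RE = re.compile(rb"/Type\s*(/[^\s/<>\[\]()]+)")

def type_statistics(data: bytes) -> Counter:
    """Count indirect objects per /Type name; objects without one are grouped together."""
    counts = Counter()
    for _obj_id, _version, body in OBJ_RE.findall(data):
        match = TYPE_RE.search(body)
        key = match.group(1).decode("latin-1") if match else "(no /Type)"
        counts[key] += 1
    return counts

stats = type_statistics(SAMPLE)
for name, count in sorted(stats.items()):
    print(f"{name}: {count}")
```

Two documents that produce the same counter are candidates for closer comparison; identical statistics alone are a hint, not a verdict.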

The search option searches for a string in indirect objects (not inside the stream of indirect objects). The search is not case-sensitive, and it is susceptible to the obfuscation techniques I documented (as I have yet to encounter these obfuscation techniques in the wild, I decided not to resort to canonicalization).
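The following sketch shows why a plain case-insensitive search can be evaded and what canonicalization would change. It assumes the obfuscation in question is the #xx hex escaping that the PDF specification permits inside Names; the text above does not spell this out, so treat that as an assumption. The helper names canonicalize_names and contains are hypothetical, and applying the substitution to a whole dictionary is a simplification (a # inside a string literal is not an escape).

```python
import re

def canonicalize_names(text: str) -> str:
    """Replace #xx hex escapes (as allowed in PDF Names) with the literal character."""
    return re.sub(r"#([0-9A-Fa-f]{2})", lambda m: chr(int(m.group(1), 16)), text)

def contains(haystack: str, needle: str, canonicalize: bool = False) -> bool:
    """Case-insensitive substring search, optionally after canonicalization."""
    if canonicalize:
        haystack = canonicalize_names(haystack)
    return needle.lower() in haystack.lower()

# Hypothetical object dictionary with one character of /JavaScript hex-escaped.
obfuscated = "<< /Type /Action /S /J#61vaScript /JS 5 0 R >>"

print(contains(obfuscated, "javascript"))                     # plain search misses it
print(contains(obfuscated, "javascript", canonicalize=True))  # canonicalized search finds it
```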

The filter option applies the filter(s) to the stream. For the moment, only FlateDecode is supported (i.e. zlib decompression).
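As a rough illustration of what applying FlateDecode amounts to, the sketch below round-trips a small content stream through Python's standard zlib module. The stream contents are made up for the example, and this is not the tool's own code.

```python
import zlib

# Hypothetical content stream, compressed the way a /FlateDecode stream is stored.
original = b"50 0 0 50 0 50 cm\n/Im1 Do\n"
compressed = zlib.compress(original)

# Applying the FlateDecode filter is, in essence, a zlib decompression.
decoded = zlib.decompress(compressed)
print(decoded)
```

Real streams can be truncated or corrupted, in which case zlib.decompress raises zlib.error; a more tolerant reader might use zlib.decompressobj() and keep whatever output it managed to produce.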

The raw option makes pdf-parser output raw data (i.e. not the printable Python representation).
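The difference between raw data and a printable Python representation can be sketched as follows. The emit function and the in-memory buffers are assumptions for illustration, and the example uses Python 3, where the printable form of bytes carries a b'...' prefix.

```python
import io

def emit(data: bytes, raw: bool, out) -> None:
    """Write data to a binary stream, either as-is or as its printable repr()."""
    if raw:
        out.write(data)
    else:
        out.write(repr(data).encode("ascii"))

# Hypothetical stream data containing a newline and two non-printable bytes.
data = b"/Im1 Do\n\x00\xff"

buf_raw, buf_printable = io.BytesIO(), io.BytesIO()
emit(data, True, buf_raw)
emit(data, False, buf_printable)

print(buf_raw.getvalue() == data)         # raw output is byte-for-byte identical
print(buf_printable.getvalue().decode())  # escaped, safe to show in a terminal
```

The printable form is convenient for reading on screen, while the raw form is what you would redirect to a file for further analysis.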

The objects option outputs the data of the indirect object whose ID was specified. This ID is not version dependent. If more than one object has the same ID (disregarding the version), all of these objects are output.

The reference option allows you to select all objects referencing the specified indirect object. This ID is not version dependent.

The type option allows you to select all objects of a given type. The type is a Name and as such is case-sensitive and must start with a slash character (/).
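The three selection options described above (by object ID, by reference, and by type) can be sketched over a toy document as shown below. The sample bytes, the regular expressions, and the function names are assumptions for illustration only; pdf-parser's actual parsing is more involved than this regex scan.

```python
import re

# Hypothetical sample with two versions of object 3.
SAMPLE = (
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n"
    b"3 0 obj\n<< /Type /Page /Parent 2 0 R /Contents 4 0 R >>\nendobj\n"
    b"3 1 obj\n<< /Type /Page /Parent 2 0 R >>\nendobj\n"
    b"4 0 obj\n<< /Length 0 >>\nendobj\n"
)

OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)
REF_RE = re.compile(rb"(\d+)\s+(\d+)\s+R\b")
TYPE_RE = re.compile(rb"/Type\s*(/[^\s/<>\[\]()]+)")

def parse_objects(data):
    """Return a list of (object id, version, body bytes) tuples."""
    return [(int(i), int(v), body) for i, v, body in OBJ_RE.findall(data)]

def by_id(objects, obj_id):
    """Select every object with this ID, whatever its version."""
    return [(i, v) for i, v, _ in objects if i == obj_id]

def referencing(objects, obj_id):
    """Select objects containing an 'id version R' reference to obj_id (version ignored)."""
    return [(i, v) for i, v, body in objects
            if any(int(ref_id) == obj_id for ref_id, _ in REF_RE.findall(body))]

def by_type(objects, type_name):
    """Select objects whose /Type equals type_name exactly (Names are case-sensitive)."""
    wanted = type_name.encode("latin-1")
    selected = []
    for i, v, body in objects:
        match = TYPE_RE.search(body)
        if match and match.group(1) == wanted:
            selected.append((i, v))
    return selected

objects = parse_objects(SAMPLE)
print(by_id(objects, 3))
print(referencing(objects, 2))
print(by_type(objects, "/Page"))
print(by_type(objects, "/page"))  # different case, so nothing is selected
```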

pdf-parser_V0_6_9.zip (https)

MD5: 27D65A96FEAF157360ACBBAAB9748D27

SHA256: 3F102595B9EAE5842A1B4723EF965344AE3AB01F90D85ECA96E9678A6C7092B7

- The -f option decodes the stream (applies its filters), and the -w option handles line breaks when displaying the result on screen.

>pdf-parser.py d:\test.pdf -o 3 -f -w
obj 3 0
Type:
Referencing: 4 0 R
Contains stream
/Filter /FlateDecode
/Length 4 0 R
50 0 0 50 0 50 cm
/Im1 Do
