1. Preface
New Year's Day is no different from any other day for me: I just do what I would normally do.
I had not used Frida before, so I spent some time learning it the other day. Compared with the Xposed hook framework, Frida is much more convenient for debugging. There are already some Frida "kill-all" scripts (also called "spit-out algorithm" scripts) on the Internet, but they generally hook the IV construction and the key construction separately, so the final output is not a single unit: the encrypted and decrypted data, the IV, and the key are not printed in the same block. I did not want to just take one from the Internet (I always feel more comfortable writing it myself once; it is not too complicated, and it helps me get familiar with Frida). So I wanted to write a kill-all script that outputs all the information for one algorithm call in the same message, and then write a program in C++ with Qt to view the recorded data.
2. What is a hook?
This question reminds me of my college days. I was running Linux Mint, and there was no QQ for Linux, so I used the QQ packaged with deepin-wine. While using it I found a bug: I could not open a received file or folder. My guess was that Mint did not have the file manager the software expected, because I was on Mint while the software had been packaged for the deepin system. So I created a command script on Mint with the same name as the deepin file manager, and had this script call Mint's own file manager to open the corresponding folder or file. That solved the problem.
This works like a hook, except that I did not intercept the delivery of a message (there was no command to receive the message in the first place). A hook usually intercepts a message before it is processed. At the time, having fixed the bug in deepin-wine QQ, I felt I had done something great. Looking back now, I simply knew too little then; it was the confidence of ignorance.
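The general idea of intercepting a call before the original handler runs can be sketched in a few lines of Python. This is only an illustration of the hook pattern, not how Frida works internally; the names `open_file` and `install_hook` are made up for the example.

```python
calls = []  # arguments seen by the hook, in call order


def open_file(path):
    """Stand-in for the original function that the program calls."""
    return "opened " + path


def install_hook(original, on_call):
    """Return a replacement that reports each call, then forwards it."""
    def hooked(*args, **kwargs):
        on_call(args, kwargs)              # intercept: inspect the arguments first
        return original(*args, **kwargs)   # then let the original do its work
    return hooked


# Rebind the name the rest of the program uses to the hooked version.
open_file = install_hook(open_file, lambda args, kwargs: calls.append(args))

result = open_file("/tmp/report.txt")
```

The caller still gets the original result, while `calls` now holds the intercepted arguments.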
3. How the kill-all (spit-out algorithm) script works
On Android, AES encryption and decryption and the MD5 digest algorithm all go through base classes provided by the platform. As long as we hook those base classes, the hook fires whenever and wherever the algorithm is called. The exception is an encryption algorithm implemented by the app itself; in that case you can only analyze the code inside the app.
For example, a typical AES encryption call:
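    public static byte[] aes_enc(byte[] bytesContent, String key) throws Exception {
        // Build the AES key from the UTF-8 bytes of the key string
        byte[] raw = key.getBytes("utf-8");
        SecretKeySpec skeySpec = new SecretKeySpec(raw, "AES");
        // The transformation string selects algorithm/mode/padding
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, skeySpec);
        // doFinal receives the plaintext and returns the ciphertext
        byte[] enc = cipher.doFinal(bytesContent);
        return enc;
    }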
You can see that the key class here is Cipher. Hooking only its doFinal() function is enough to get the ciphertext and plaintext, and the secret key can be obtained by hooking the SecretKeySpec constructor.
All of these classes are implemented in jce.jar, which is part of the JDK.
Cipher class:
SecretKeySpec class
4. Analysis
When hooking the constructors or ordinary functions of these classes, we run into a lot of overloads. Here is a brief explanation of how these overloads call each other.
Cipher class
    // getInstance function
    // 2.overload('java.lang.String', 'java.lang.String')        -> 3
    // 1.overload('java.lang.String')                            |
    // 3.overload('java.lang.String', 'java.security.Provider')  |

    // init function
    // 1.overload('int', 'java.security.Key')  -> 4
    // 2.overload('int', 'java.security.cert.Certificate')  -> 6
    // 3.overload('int', 'java.security.Key', 'java.security.AlgorithmParameters')  -> 7
    // 5.overload('int', 'java.security.Key', 'java.security.spec.AlgorithmParameterSpec')  -> 8
    // 6.overload('int', 'java.security.cert.Certificate', 'java.security.SecureRandom')  |
    // 4.overload('int', 'java.security.Key', 'java.security.SecureRandom')  |
    // 7.overload('int', 'java.security.Key', 'java.security.AlgorithmParameters', 'java.security.SecureRandom')  |
    // 8.overload('int', 'java.security.Key', 'java.security.spec.AlgorithmParameterSpec', 'java.security.SecureRandom')  |

    // doFinal function
    // 1.overload()  |
    // 2.overload('[B')  |
    // 3.overload('java.nio.ByteBuffer', 'java.nio.ByteBuffer')  |
    // 4.overload('[B', 'int')  |
    // 5.overload('[B', 'int', 'int')  |
    // 6.overload('[B', 'int', 'int', '[B')  |
    // 7.overload('[B', 'int', 'int', '[B', 'int')  |

The numbers are labels for the individual overloads, -> means that the overload calls the overload with that number, and | means that it calls no other overload. This makes the call relationships between the overloads easy to see.
So why sort out the overload relationships? Because it tells you which overloads actually need to be hooked. If we do not know how the overloads call each other, we end up hooking every overload of a function directly:
    // Java.use has to run inside Java.perform
    Java.perform(function () {
        var cipher = Java.use("javax.crypto.Cipher");
        // Encryption type
        // 2.overload('java.lang.String', 'java.lang.String')        -> 3
        // 1.overload('java.lang.String')                            |
        // 3.overload('java.lang.String', 'java.security.Provider')  |
        // Naive approach: hook every getInstance overload
        for (let index = 0; index < cipher.getInstance.overloads.length; index++) {
            cipher.getInstance.overloads[index].implementation = function () {
                console.log("type:" + JSON.stringify(arguments[0]));
                console.log(JSON.stringify(this));
                return this.getInstance.apply(this, arguments);
            }
        }
    });
Run this way, you may find that "type:" is printed twice. Overloads 1, 2, and 3 are all hooked; the app may call only overload 2, but 2 itself calls 3, so the output appears twice. For getInstance, therefore, hooking only overloads 1 and 3 is enough to observe every call.
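The double output can be reproduced with a toy model in Python. `FakeCipher` below is a made-up stand-in that only copies the delegation pattern from the overload list (overload 2 forwards to overload 3); it is not the real JDK class, and `hook` and `run_with_hooks` are helper names invented for this sketch.

```python
class FakeCipher:
    """Toy model of the getInstance overload chain."""

    @staticmethod
    def get_instance_1(transformation):                  # overload 1: calls no other overload
        return ("cipher", transformation, None)

    @staticmethod
    def get_instance_2(transformation, provider_name):   # overload 2: delegates to 3
        return FakeCipher.get_instance_3(transformation, "provider:" + provider_name)

    @staticmethod
    def get_instance_3(transformation, provider):        # overload 3: calls no other overload
        return ("cipher", transformation, provider)


def hook(name, log):
    """Replace FakeCipher.<name> with a logging wrapper; return the original."""
    original = getattr(FakeCipher, name)

    def wrapper(*args):
        log.append((name, args[0]))
        return original(*args)

    setattr(FakeCipher, name, staticmethod(wrapper))
    return original


def run_with_hooks(names):
    """Hook the given overloads, make one call to overload 2, then unhook."""
    log = []
    originals = {name: hook(name, log) for name in names}
    try:
        FakeCipher.get_instance_2("AES/ECB/PKCS5Padding", "BC")
    finally:
        for name, original in originals.items():
            setattr(FakeCipher, name, staticmethod(original))
    return log


all_hooked = run_with_hooks(["get_instance_1", "get_instance_2", "get_instance_3"])
terminal_only = run_with_hooks(["get_instance_1", "get_instance_3"])
print(len(all_hooked), len(terminal_only))  # prints: 2 1
```

Hooking all three overloads logs the single call twice, while hooking only the two overloads marked with | logs it once.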
5. Deep analysis
This lets you observe the arguments of the calls while avoiding hooks on unnecessary functions. However, the data still cannot be output in a single block: the encryption type, the plaintext, and the ciphertext may be printed in separate places. To achieve what I described in the preface, a single hooked function must obtain the secret key, IV, plaintext, ciphertext, mode, and whether the operation is encryption or decryption, and then output everything together. In practice this information has to come from the object itself, that is, from the fields bound to the object.
Mode: the mode here refers to the string "AES/ECB/PKCS5Padding", so start with the getInstance function. Only the overloads that are always reached need to be analyzed, that is, the ones marked with | at the end:
As you can see, getInstance returns an instantiated Cipher object and passes this string on to it, so the next step is to follow up and analyze the constructors:
All of these constructors assign the paramString parameter to this.transformation, so the mode is available through this.
Encryption or decryption: this is the first parameter of the init function, so the analysis starts from init and traces where that parameter goes, again looking only at the overloads that are always called:
All four of these overloads assign paramInt to this.opmode.
IV: this one is a bit harder to find. I will skip the detailed process; it can be obtained through this.spi.engineGetIV() or this.getIV(), and this.getIV() itself calls this.spi.engineGetIV().
Ciphertext and plaintext: because I want to hook only one function, I chose the final doFinal function. There I can get the ciphertext and plaintext, and through this I can get the mode, the encryption-or-decryption flag, and the IV mentioned above.
Secret key: this is the only parameter I could not get through this, so I hooked the init function, whose second parameter is the secret key.
6. Problems and Solutions
Since I can get everything except the secret key through this, I first saved the key in a global JS variable and printed it in my doFinal hook. But there is a problem: multiple threads may be encrypting and decrypting at the same time, so the key in the global variable may not be the key used by the object whose doFinal is running. I therefore need to bind the key to the unique ID of the instantiated object, and then look the key up by that ID inside the doFinal hook.
The solution is a dictionary (that is what Python calls it; in JS it is just a plain object), where the key is the unique ID of the object and the value is the secret key. This keeps the mapping between object and key accurate.
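The difference between the global variable and the per-object dictionary can be shown with a small Python model. `ToyCipher`, `hooked_init`, and `hooked_do_final` are hypothetical names for this sketch only; the real script does the same bookkeeping in JS inside the Frida hooks.

```python
class ToyCipher:
    """Stand-in for a Cipher object; it only reverses the input bytes."""

    def init(self, opmode, key):
        self.opmode = opmode

    def do_final(self, data):
        return data[::-1]


last_key = None        # the naive approach: one global variable
keys_by_object = {}    # the fix: object id -> key captured in init
records = []


def hooked_init(cipher, opmode, key):
    global last_key
    last_key = key
    # id() is unique among live objects, so both ciphers must stay alive here
    keys_by_object[id(cipher)] = key
    return cipher.init(opmode, key)


def hooked_do_final(cipher, data):
    result = cipher.do_final(data)
    records.append({
        "global_key": last_key,
        "object_key": keys_by_object.get(id(cipher), b""),
        "in": data,
        "out": result,
    })
    return result


a, b = ToyCipher(), ToyCipher()
hooked_init(a, 1, b"key-A")
hooked_init(b, 1, b"key-B")       # a second cipher is initialised before a finishes
hooked_do_final(a, b"hello")

print(records[0]["global_key"])   # b'key-B' (wrong key for cipher a)
print(records[0]["object_key"])   # b'key-A' (the key cipher a was initialised with)
```

The interleaving here stands in for two threads; the lookup by object ID returns the right key regardless of which cipher was initialised last.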
7. Calling from Python and saving the data
hookCalc.js
    // Maps the ID of each Cipher object to the key passed to its init()
    var allKeys = {};
    Java.perform(function () {
        var cipher = Java.use("javax.crypto.Cipher");
        // Hook every init overload: the second argument carries the key
        for (let index = 0; index < cipher.init.overloads.length; index++) {
            cipher.init.overloads[index].implementation = function () {
                allKeys[this.toString()] = arguments[1].getEncoded();
                this.init.apply(this, arguments);
            }
        }
        // Hook every doFinal overload and collect everything into one message
        for (let index = 0; index < cipher.doFinal.overloads.length; index++) {
            cipher.doFinal.overloads[index].implementation = function () {
                var dict = {};
                dict["EorD"] = this.opmode.value; // Mode: encryption or decryption
                dict["method"] = this.transformation.value; // Encryption type
                var iv = this.spi.value.engineGetIV();
                if (iv) {
                    dict["iv"] = iv;
                } else {
                    dict["iv"] = "";
                }
                if (allKeys[this.toString()]) {
                    dict["password"] = allKeys[this.toString()];
                } else {
                    dict["password"] = "";
                }
                var retVal = this.doFinal.apply(this, arguments);
                dict["receData"] = "";
                dict["resData"] = "";
                // Record input and output unless the call used ByteBuffer arguments
                if (arguments.length >= 1 && arguments[0].$className != "java.nio.ByteBuffer") {
                    dict["receData"] = arguments[0];
                    dict["resData"] = retVal;
                }
                send(dict);
                return retVal;
            }
        }
    });
main.py
    import hashlib
    import sqlite3
    import sys

    import frida

    index = 0
    db = "me.db"


    def md5(data):
        hl = hashlib.md5()
        hl.update(data)
        return hl.hexdigest()


    def to_bytes(values):
        # Java bytes arrive as signed ints (-128..127); map them to 0..255
        return bytes([i if i >= 0 else 256 + i for i in values])


    def createDB():
        # method holds the transformation string, EorD the integer opmode
        sql = '''
        CREATE TABLE IF NOT EXISTS "record" (
            "id" TEXT NOT NULL,
            "method" TEXT,
            "EorD" INTEGER,
            "password" BLOB,
            "iv" BLOB,
            "receData" BLOB,
            "resData" BLOB,
            PRIMARY KEY("id")
        );
        '''
        conn = sqlite3.connect(db)
        cursor = conn.cursor()
        cursor.execute(sql)
        conn.commit()
        conn.close()


    def on_message(message, data):
        global index
        if message['type'] != "send":
            return
        conn = sqlite3.connect(db)
        try:
            cursor = conn.cursor()
            payload = message['payload']
            method = payload["method"]
            EorD = payload["EorD"]
            password = to_bytes(payload["password"])
            iv = to_bytes(payload["iv"])
            receData = to_bytes(payload["receData"])
            resData = to_bytes(payload["resData"])
            # The MD5 of all fields is the primary key, so a repeated record is rejected
            id_md5 = md5((method + str(EorD)).encode() + password + iv + receData + resData)
            sql = "insert into record(id,method,EorD,password,iv,receData,resData) values (?,?,?,?,?,?,?)"
            cursor.execute(sql, (id_md5, method, EorD, sqlite3.Binary(password), sqlite3.Binary(iv),
                                 sqlite3.Binary(receData), sqlite3.Binary(resData)))
            conn.commit()
            print(index)
            index += 1
        except Exception as e:
            print(e)
        finally:
            conn.close()


    with open("hookCalc.js", encoding='utf8') as f:
        js = f.read()

    # Create the table before the script starts sending messages
    createDB()
    process = frida.get_remote_device().attach("APP Name, not package name")
    script = process.create_script(js)
    script.on("message", on_message)
    script.load()
    print(process)
    sys.stdin.read()

The data is hard to preview directly because there is a lot of it and much of it is not printable (binary data prints as garbage). Although JS has functions for converting encodings and data formats, I am more used to Python. I saved the data to a sqlite3 database, using an MD5 of each record as its id so that the same record is not saved twice, and then wrote a simple program in C++ with Qt to preview the data.
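The two data-handling steps in main.py, turning signed Java bytes into Python bytes and deriving the MD5 id, can be tried in isolation. The helper names `java_bytes_to_bytes` and `record_id` are made up for this sketch; the field order inside `record_id` follows main.py.

```python
import hashlib


def java_bytes_to_bytes(values):
    """Convert signed Java bytes (-128..127) to a Python bytes object."""
    return bytes(v & 0xFF for v in values)


def record_id(method, e_or_d, password, iv, rece_data, res_data):
    """MD5 over all fields, concatenated in the same order as in main.py."""
    blob = (method + str(e_or_d)).encode() + password + iv + rece_data + res_data
    return hashlib.md5(blob).hexdigest()


raw = java_bytes_to_bytes([-1, 0, 127, -128])
print(raw.hex())  # ff007f80

first = record_id("AES/ECB/PKCS5Padding", 1, b"key", b"", b"plain", raw)
again = record_id("AES/ECB/PKCS5Padding", 1, b"key", b"", b"plain", raw)
other = record_id("AES/ECB/PKCS5Padding", 2, b"key", b"", b"plain", raw)
```

Identical records give the same id, so the primary key rejects the second insert, while any changed field (here the opmode) gives a different id.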
C++ Preview Data Software
The software reads the me.db database in the same directory.
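For a quick look without the Qt program, the same table can also be read back from Python. This is a minimal sketch, not the C++ viewer: `load_records` is a made-up helper that assumes the record table created by main.py and shows the binary columns as hex strings.

```python
import sqlite3

# Cipher.ENCRYPT_MODE is 1 and Cipher.DECRYPT_MODE is 2 in javax.crypto.Cipher
OPMODE_NAMES = {1: "encrypt", 2: "decrypt"}


def load_records(conn):
    """Return every row of the record table with binary fields as hex strings."""
    cursor = conn.execute(
        "SELECT method, EorD, password, iv, receData, resData FROM record"
    )
    records = []
    for method, e_or_d, password, iv, rece_data, res_data in cursor:
        records.append({
            "method": method,
            # EorD may come back as text or integer depending on the column type
            "direction": OPMODE_NAMES.get(int(e_or_d), str(e_or_d)),
            "password": bytes(password or b"").hex(),
            "iv": bytes(iv or b"").hex(),
            "receData": bytes(rece_data or b"").hex(),
            "resData": bytes(res_data or b"").hex(),
        })
    return records
```

Calling `load_records(sqlite3.connect("me.db"))` after a capture session returns one dictionary per recorded doFinal call.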
Other
The script mainly targets the AES algorithm and should also be able to capture RSA. MD5 support can be added as an extension. AES has many parameters, which is why keeping the output in one block is a problem; MD5 does not have this problem at all.