[Advanced Python Crawler Learning] 100 JS Reverse Engineering Examples: Reversing the Youdao Translation Interface Parameters

Reverse target

Reverse process

Packet capture analysis

Enter any text on the Youdao translation page and the translation appears without the page refreshing, so we can infer that it is loaded via Ajax. Open the developer tools and select XHR to filter the Ajax requests. You can see a request to the URL https://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule. When we enter "test", the returned data has a structure similar to the following:

{
	"translateResult": [
		[{
			"tgt": "test",
			"src": "test"
		}]
	],
	"errorCode": 0,
	"type": "zh-CHS2en",
	"smartResult": {
		"entries": ["", "[test] test\r\n", "measurement\r\n"],
		"type": 1
	}
}

translateResult holds the translation and smartResult holds additional suggested translations, so this URL is the translation interface we need.
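
As a rough sketch of how this structure can be read in Python, the helper below walks the nested translateResult list and tidies the smartResult entries. The function name extract_translation is made up for illustration, and two things are assumptions rather than facts from the capture: that the outer list holds paragraphs with the inner list holding segments, and that smartResult may be missing for some inputs.

```python
def extract_translation(payload):
    """Return (translated_text, smart_entries) from a decoded response dict."""
    # Assumption: translateResult is a list of paragraphs, each a list of segments.
    paragraphs = [
        "".join(segment["tgt"] for segment in paragraph)
        for paragraph in payload["translateResult"]
    ]
    # Assumption: smartResult can be absent, so fall back to an empty list.
    entries = payload.get("smartResult", {}).get("entries", [])
    # Strip the trailing "\r\n" and drop empty entries.
    smart = [entry.strip() for entry in entries if entry.strip()]
    return "\n".join(paragraphs), smart


# The sample response shown above, written as a Python dict.
sample = {
    "translateResult": [[{"tgt": "test", "src": "test"}]],
    "errorCode": 0,
    "type": "zh-CHS2en",
    "smartResult": {"entries": ["", "[test] test\r\n", "measurement\r\n"], "type": 1},
}

text, smart = extract_translation(sample)
print(text)
print(smart)
```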

Since it is a POST request, look at its Form Data:

  • i: the string to be translated;
  • from: the source language;
  • to: the target language;
  • lts: a timestamp;
  • smartresult, client, doctype, version, keyfrom: fixed values;
  • action: FY_BY_REALTlME for real-time translation, FY_BY_CLICKBUTTION for a manual click on the translate button;
  • salt, sign, and bv: encrypted values that need further analysis (salt and sign change on every request).
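
One way to spot which fields need attention is to capture the same query twice and compare the two Form Data sets. The sketch below does that with a hypothetical helper, split_fields; the two dictionaries contain made-up placeholder values, not real captures.

```python
def split_fields(capture_a, capture_b):
    """Split form fields into those equal in both captures and those that differ."""
    fixed = {k: v for k, v in capture_a.items() if capture_b.get(k) == v}
    changing = sorted(k for k in capture_a if capture_b.get(k) != capture_a[k])
    return fixed, changing


# Placeholder values for illustration only; sign is shortened on purpose.
first = {
    "i": "test", "client": "fanyideskweb", "doctype": "json",
    "lts": "1625907853887", "salt": "16259078538871", "sign": "aaaa",
}
second = {
    "i": "test", "client": "fanyideskweb", "doctype": "json",
    "lts": "1625907900000", "salt": "16259079000004", "sign": "bbbb",
}

fixed, changing = split_fields(first, second)
print("fixed:", sorted(fixed))
print("changing:", changing)
```

Fields that land in the changing list are the ones worth searching for in the site's JS.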

Parameter reverse engineering

Searching globally for any of the three encrypted parameters salt, sign, and bv returns many results. Comparing them one by one, you can find that starting at line 8969 of the fanyi.min.js file, all the Form Data parameters are assembled. Set a breakpoint there and debug; you can see that all the data matches the final request. The encrypted parameters all take their values from r, and looking upward we find r = v.generateSaltSign(n);, where n is the string to be translated.

Follow the generateSaltSign function and click to jump to its definition, where you can see the key encryption code:

var r = function(e) {
    var t = n.md5(navigator.appVersion)
      , r = "" + (new Date).getTime()
      , i = r + parseInt(10 * Math.random(), 10);
    return {
        ts: r,
        bv: t,
        salt: i,
        sign: n.md5("fanyideskweb" + e + i + "Y2FYu%TNSbMCxc3t2u^XT")
    }
};

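Analyzing this key encryption code:

  • navigator.appVersion is the browser version string, which closely mirrors the User-Agent (in Chrome it is the User-Agent without the leading "Mozilla/")

  • The value of bv is the MD5 hash of navigator.appVersion (the code in this article hashes the full User-Agent string instead)

  • The value of ts is a 13-digit millisecond timestamp, sent in the form as lts

  • The value of salt is the value of ts with a random integer from 0 to 9 appended

  • The value of sign is the MD5 hash of four concatenated strings: the fixed string "fanyideskweb", the string to be translated, the value of salt, and a second fixed string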

This process is relatively simple, so it can be reproduced directly in Python:

import time
import random
import hashlib


query = "String to be translated"
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"

lts = str(int(time.time() * 1000))                                # 13-digit timestamp in milliseconds
salt = lts + str(random.randint(0, 9))                            # timestamp + random digit 0-9 gives the salt
sign = "fanyideskweb" + query + salt + "Y2FYu%TNSbMCxc3t2u^XT"    # concatenate the strings to be signed
sign = hashlib.md5(sign.encode()).hexdigest()                     # MD5-hash the string to get the final sign
bv = hashlib.md5(user_agent.encode()).hexdigest()                 # MD5-hash the UA to get bv
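
To check a reproduction like this against a captured request, it can help to make the timestamp and the random digit injectable so the output is deterministic. The following is a minimal sketch under that idea; make_params and matches_capture are hypothetical helper names, and the fixed string is the one from the JS shown earlier, which may change whenever the site is updated.

```python
import hashlib
import random
import time

CLIENT = "fanyideskweb"
SECRET = "Y2FYu%TNSbMCxc3t2u^XT"  # fixed string from the site's JS; may change over time


def make_params(query, user_agent, now_ms=None, digit=None):
    """Build lts, salt, sign and bv; now_ms and digit can be fixed for testing."""
    lts = str(int(time.time() * 1000) if now_ms is None else now_ms)
    digit = random.randint(0, 9) if digit is None else digit
    salt = lts + str(digit)
    sign = hashlib.md5((CLIENT + query + salt + SECRET).encode("utf-8")).hexdigest()
    bv = hashlib.md5(user_agent.encode("utf-8")).hexdigest()
    return {"lts": lts, "salt": salt, "sign": sign, "bv": bv}


def matches_capture(query, captured_salt, captured_sign):
    """Recompute sign from a captured salt and compare it with the captured sign."""
    expected = hashlib.md5(
        (CLIENT + query + captured_salt + SECRET).encode("utf-8")
    ).hexdigest()
    return expected == captured_sign


# Fixed inputs make the result repeatable.
params = make_params("test", "Mozilla/5.0", now_ms=1625907853887, digit=7)
print(params["salt"])
print(matches_capture("test", params["salt"], params["sign"]))
```

If matches_capture returns False for a salt and sign copied from the browser, the fixed string has probably changed and the JS needs to be inspected again.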

Alternatively, reuse the JS directly and use the CryptoJS module in Node.js for the MD5 hashing. The JS can be rewritten as follows:

// Load the crypto-js module
var CryptoJS = require('crypto-js')

function getEncryptedParams(data, ua) {
    var bv = CryptoJS.MD5(ua).toString()
        , lts = "" + (new Date).getTime()
        , salt = lts + parseInt(10 * Math.random(), 10)
    // Use the same fixed string as the site's generateSaltSign code
    var sign = CryptoJS.MD5('fanyideskweb' + data + salt + 'Y2FYu%TNSbMCxc3t2u^XT').toString()
    return {bv: bv, lts: lts, salt: salt, sign: sign}
}

Complete code

youdao_encrypt.js

Get the encrypted parameters salt, sign, and bv:

// Load the crypto-js module
var CryptoJS = require('crypto-js')

function getEncryptedParams(data, ua) {
    var bv = CryptoJS.MD5(ua).toString(),
        lts = "" + (new Date).getTime(),
        salt = lts + parseInt(10 * Math.random(), 10)
    // Use the same fixed string as the site's generateSaltSign code
    var sign = CryptoJS.MD5('fanyideskweb' + data + salt + 'Y2FYu%TNSbMCxc3t2u^XT').toString()
    return { bv: bv, lts: lts, salt: salt, sign: sign }
}

// var ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
// var data = "test"
// console.log(getEncryptedParams(data, ua));

youdaofanyi.py

#!/usr/bin/env python3
# -*- coding: utf-8 -*-


import time
import random
import hashlib

import execjs
import requests


translate_url = 'https://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'
user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'


def get_translation_result(parameters):
    headers = {
        'User-Agent': user_agent,
        'Host': 'fanyi.youdao.com',
        'Origin': 'https://fanyi.youdao.com',
        'Referer': 'https://fanyi.youdao.com/',
        'X-Requested-With': 'XMLHttpRequest',
        'sec-ch-ua': '" Not;A Brand";v="99", "Google Chrome";v="91", "Chromium";v="91"',
        'Cookie': 'OUTFOX_SEARCH_USER_ID="-1848382357@10.169.0.84"; ___rl__test__cookies=1625907853887; OUTFOX_SEARCH_USER_ID_NCOO=132978720.55854891'
    }
    response = requests.post(url=translate_url, headers=headers, data=parameters)
    result = response.json()['translateResult'][0][0]['tgt']
    return result


def get_parameters_by_python(query, translate_from, translate_to):
    lts = str(int(time.time() * 1000))                                # 13-digit timestamp in milliseconds
    salt = lts + str(random.randint(0, 9))                            # timestamp + random digit 0-9 gives the salt
    sign = "fanyideskweb" + query + salt + "Y2FYu%TNSbMCxc3t2u^XT"    # concatenate the strings to be signed
    sign = hashlib.md5(sign.encode()).hexdigest()                     # MD5-hash the string to get the final sign
    bv = hashlib.md5(user_agent.encode()).hexdigest()                 # MD5-hash the UA to get bv
    parameters = {
        'i': query,
        'from': translate_from,
        'to': translate_to,
        'smartresult': 'dict',
        'client': 'fanyideskweb',
        'salt': salt,
        'sign': sign,
        'lts': lts,
        'bv': bv,
        'doctype': 'json',
        'version': '2.1',
        'keyfrom': 'fanyi.web',
        'action': 'FY_BY_REALTlME'
    }
    return parameters


def get_parameters_by_javascript(query, translate_from, translate_to):
    with open('youdao_encrypt.js', 'r', encoding='utf-8') as f:
        youdao_js = f.read()
    params = execjs.compile(youdao_js).call('getEncryptedParams', query, user_agent)    # Call the function defined in youdao_encrypt.js
    bv = hashlib.md5(user_agent.encode()).hexdigest()                                   # MD5-hash the UA to get bv
    parameters = {
        'i': query,
        'from': translate_from,
        'to': translate_to,
        'smartresult': 'dict',
        'client': 'fanyideskweb',
        'salt': params['salt'],
        'sign': params['sign'],
        'lts': params['lts'],
        'bv': bv,
        'doctype': 'json',
        'version': '2.1',
        'keyfrom': 'fanyi.web',
        'action': 'FY_BY_REALTlME'
    }
    return parameters


def main():
    query = input('Please enter the text to be translated:')
    # Source and target language; AUTO leaves detection to the service
    translate_from = translate_to = 'AUTO'
    # Obtain the encrypted parameters via Python or JavaScript; use either one
    param = get_parameters_by_python(query, translate_from, translate_to)
    # param = get_parameters_by_javascript(query, translate_from, translate_to)
    result = get_translation_result(param)
    print('The result of translation is:', result)


if __name__ == '__main__':
    main()
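
The script above indexes straight into translateResult, which raises a KeyError if the interface rejects the request. A small defensive sketch is shown below; it assumes that a non-zero errorCode signals a rejected request, which the capture only hints at by showing errorCode 0 on success. TranslationError and parse_response are hypothetical names.

```python
class TranslationError(Exception):
    """Raised when the interface reports a non-zero errorCode."""


def parse_response(payload):
    """Return the first translated segment, or raise if the request was rejected."""
    code = payload.get("errorCode")
    # Assumption: any value other than 0 means the request was not accepted.
    if code != 0:
        raise TranslationError(f"interface returned errorCode {code}")
    return payload["translateResult"][0][0]["tgt"]


ok = {"translateResult": [[{"tgt": "test", "src": "test"}]], "errorCode": 0}
print(parse_response(ok))

try:
    parse_response({"errorCode": 50})  # 50 is an arbitrary non-zero example value
except TranslationError as exc:
    print("request failed:", exc)
```

In get_translation_result, parse_response(response.json()) could replace the direct indexing so that a bad sign or an outdated fixed string produces a readable error.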

 

Keywords: Python Javascript crawler

Added by Cloud9247 on Thu, 20 Jan 2022 18:42:35 +0200