Brief introduction
When you analyze the JavaScript code of a simple site, the functions are usually declared one after another, for example:
function a() {console.log("a")}
function b() {console.log("b")}
function c() {console.log("c")}
On a slightly more complex site, however, you will often encounter a code structure similar to the following:
!function(i) {
    function n(t) {
        return i[t].call(a, b, c, d)
    }
}([
    function(t, e) {},
    function(t, e, n) {},
    function(t, e, r) {},
    function(t, e, o) {}
]);
This pattern is very common in JavaScript and looks simple to anyone familiar with the language, but most crawler engineers write their code in Python or Java and may be confused when they first see it. Because it turns up constantly when extracting encrypted JS code, crawler engineers need to understand it.
There does not seem to be an official name for this pattern. It is a form of modular programming, and because it resembles the code that the webpack bundler generates, most people simply call it webpack. The example above is hard to read, so here is a simplified version:
!function (allModule) {
    function useModule(whichModule) {
        allModule[whichModule].call(null, "hello world!");
    }
    useModule(0)
}([
    function module0(param) {console.log("module0: " + param)},
    function module1(param) {console.log("module1: " + param)},
    function module2(param) {console.log("module2: " + param)},
]);
Running the code above prints module0: hello world!. The variable and function names make the general idea clear: useModule(0) selects the first function from the array of functions and passes hello world! to module0, which prints it.
Looking closely at this code, it relies mainly on two pieces of syntax: !function(){}() and function.call(). The following sections introduce them one by one.
Function declaration and function expression
In ECMAScript (the standard behind JavaScript), the two most common ways to create a function object are function declarations and function expressions. The ECMAScript specification states that a function declaration must always have an identifier (the function name), whereas a function expression may omit it.
A function declaration gives the function a name and is hoisted, that is, loaded into its scope before any code runs, so the function can be called either before or after the declaration:
test("Hello World!")
function test(arg) {
    console.log(arg)
}
A function expression creates a function (often anonymous) and assigns it to a variable. The function is only defined when execution reaches the expression, so calls work only after that point; calling it earlier raises an error:
var test = function (arg) {
    console.log(arg)
}
test("Hello World!")
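The difference between the two forms can be observed directly. The following is a minimal sketch; the names declared, expressed and results are made up for illustration.

```javascript
var results = [];

// A function declaration is hoisted together with its body, so this call works.
results.push(declared());

function declared() {
    return "declaration: callable before its definition";
}

// With var, only the variable name is hoisted; its value is still undefined
// at this point, so calling it throws a TypeError.
try {
    expressed();
} catch (err) {
    results.push("expression: " + err.name);
}

var expressed = function () {
    return "expression: callable only after the assignment";
};

results.push(expressed());
console.log(results.join("\n"));
```

The middle line of the output is expression: TypeError, which is the error mentioned above.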
IIFE: immediately invoked function expressions
IIFE stands for Immediately Invoked Function Expression; it is also known as a self-executing function, an immediately executed function, or a self-executing anonymous function. An IIFE is a syntax pattern: a function expression (named or anonymous) is executed as soon as it is created. Variables declared inside an IIFE cannot be accessed from outside it, so the pattern is mainly used to isolate scope and avoid polluting the enclosing (often global) namespace.
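As a small illustration of the scope isolation, the sketch below keeps a variable private inside an IIFE and exposes only two functions. The names counter and hiddenCount are hypothetical.

```javascript
var counter = (function () {
    var hiddenCount = 0; // visible only inside this IIFE

    return {
        increment: function () {
            hiddenCount += 1;
            return hiddenCount;
        },
        current: function () {
            return hiddenCount;
        }
    };
})();

counter.increment();
counter.increment();

console.log(counter.current());  // 2
console.log(typeof hiddenCount); // "undefined": the variable did not leak out
```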
IIFE basic syntax
An IIFE can be written in several ways, mainly the following:
1. Prefix the anonymous function with a unary operator and follow it with ():
!function () { console.log("I AM IIFE") }();
-function () { console.log("I AM IIFE") }();
+function () { console.log("I AM IIFE") }();
~function () { console.log("I AM IIFE") }();
2. Follow the anonymous function with (), then wrap the whole thing in ():
(function () { console.log("I AM IIFE") }());
3. Wrap the anonymous function in () first, then add ():
(function () { console.log("I AM IIFE") })();
4. With an arrow function expression, wrap the arrow function in () first, then add ():
(() => { console.log("I AM IIFE") })()
5. Prefix the anonymous function with the void keyword and follow it with (). void evaluates the expression that follows it and returns undefined:
void function () { console.log("I AM IIFE") }();
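The operator or parentheses are needed because a statement that begins with the function keyword is parsed as a function declaration, which requires a name and cannot be called on the spot. The sketch below checks this by asking the engine to parse each form without running it; the helper parses() is made up for this example and relies on the Function constructor, which some environments (for example pages with a strict Content Security Policy) may block.

```javascript
// Returns true when the source text parses as a list of statements.
function parses(source) {
    try {
        new Function(source); // only parses the text; the body is never executed
        return true;
    } catch (err) {
        if (err instanceof SyntaxError) {
            return false;
        }
        throw err;
    }
}

var forms = [
    "function () {}()",      // bare form: treated as a declaration
    "!function () {}()",
    "(function () {}())",
    "(function () {})()",
    "(() => {})()",
    "void function () {}()"
];

var report = forms.map(function (source) {
    return source + " -> " + (parses(source) ? "parses" : "SyntaxError");
});
console.log(report.join("\n"));
```

Only the first, bare form should be reported as a SyntaxError.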
Sometimes you will see a semicolon before or after an immediately invoked function, for example:
;(function () { console.log("I AM IIFE") }())
;!function () { console.log("I AM IIFE") }()
An immediately invoked function is usually used as a standalone module, and on its own it causes no problems. When it is combined with other code, however, the surrounding code may be missing a terminating semicolon, so it is recommended to add a semicolon before or after the IIFE to isolate it from the preceding or following code; otherwise unexpected errors may occur.
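One such unexpected error is sketched below. When the previous statement ends with a function expression and has no semicolon, the parentheses of the IIFE are read as a call to that function. All names here (unsafeHandler, safeHandler and the two arrays) are hypothetical.

```javascript
var unsafeCalls = [];

// No semicolon after the function expression, so the parenthesised IIFE on the
// next line is parsed as an argument list: the function is called right away,
// and its return value is then called by the trailing ().
var unsafeHandler = function (arg) {
    unsafeCalls.push("handler called with a " + typeof arg);
    return function () {
        unsafeCalls.push("returned function called");
        return "not a function any more";
    };
}
(function () { unsafeCalls.push("IIFE ran"); })();

var safeCalls = [];

// The leading semicolon ends the previous statement, so the IIFE runs on its own.
var safeHandler = function () { safeCalls.push("handler called"); }
;(function () { safeCalls.push("IIFE ran"); })();

console.log(unsafeCalls, typeof unsafeHandler);
console.log(safeCalls, typeof safeHandler);
```

In the first half the IIFE body never runs and unsafeHandler ends up holding a string; in the second half only the IIFE runs and safeHandler is still a function.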
IIFE parameter passing
Arguments are passed to an IIFE by placing them in the final ():
var text = "I AM IIFE";
(function (param) {
    console.log(param)
})(text);
// I AM IIFE
var dict = {name: "Bob", age: "20"};
(function (param) {
    console.log(param.name);
})(dict);
// Bob
var list = [1, 2, 3, 4, 5];
(function (param) {
    var sum = 0;
    for (var i = 0; i < param.length; i++) {
        sum += param[i];
    }
    console.log(sum);
})(list);
// 15
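An IIFE can also take several arguments and hand a result back through its return value, which is how the module array reaches the loader in webpack-style code. A minimal sketch with made-up names (settings, numbers, label):

```javascript
var settings = {prefix: "sum"};
var numbers = [1, 2, 3, 4, 5];

// Two arguments go in through the final (), one value comes back out.
var label = (function (config, values) {
    var sum = 0;
    for (var i = 0; i < values.length; i++) {
        sum += values[i];
    }
    return config.prefix + "=" + sum;
})(settings, numbers);

console.log(label); // sum=15
```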
Function.prototype.call() / apply() / bind()
Function.prototype.call(), Function.prototype.apply() and Function.prototype.bind() are commonly used methods. They do nearly the same job: changing what this points to inside a function. The differences between them are as follows:
- call() executes the function immediately and accepts one or more arguments separated by commas;
- apply() executes the function immediately and accepts a single array containing the arguments;
- bind() does not execute the function immediately; it returns a new function with this changed, for later calls. It accepts arguments in the same way as call().
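The three methods can be compared side by side. In this sketch the function describe() and the object user are invented for the comparison; all three calls produce the same string.

```javascript
function describe(greeting, punctuation) {
    return greeting + ", " + this.name + punctuation;
}

var user = {name: "Bob"};

// call(): runs now, arguments listed one by one.
var viaCall = describe.call(user, "Hello", "!");

// apply(): runs now, arguments supplied as one array.
var viaApply = describe.apply(user, ["Hello", "!"]);

// bind(): runs later; here the first argument is fixed in advance as well.
var boundDescribe = describe.bind(user, "Hello");
var viaBind = boundDescribe("!");

console.log(viaCall);  // Hello, Bob!
console.log(viaApply); // Hello, Bob!
console.log(viaBind);  // Hello, Bob!
```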
call()
call() accepts multiple parameters. The first parameter, thisArg, specifies what this points to inside the function body. In non-strict mode, if thisArg is null or undefined, this is automatically replaced with the global object (the window object in a browser); in strict mode, this inside the function body stays null or undefined, exactly as passed. From the second parameter onward, each argument is passed to the function in order. The basic syntax is as follows:
function.call(thisArg, arg1, arg2, ...)
Example:
function test(a, b, c) {
    console.log(a + b + c)
}
test.call(null, 1, 2, 3)
// 6
function test() {
    console.log(this.firstName + " " + this.lastName)
}
var data = {firstName: "John", lastName: "Doe"}
test.call(data)
// John Doe
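The strict and non-strict handling of a null thisArg can be checked with two small functions. This is a sketch that assumes the file runs as an ordinary non-strict script (for example with node file.js rather than as an ES module); sloppyThis and strictThis are hypothetical names.

```javascript
function sloppyThis() {
    return this;
}

function strictThis() {
    "use strict";
    return this;
}

var sloppyResult = sloppyThis.call(null);
var strictNull = strictThis.call(null);
var strictUndefined = strictThis.call(undefined);

// In non-strict code null is replaced by the global object.
console.log(sloppyResult === globalThis);
// In strict code the value is passed through untouched.
console.log(strictNull);      // null
console.log(strictUndefined); // undefined
```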
apply()
apply() accepts two parameters. The first, thisArg, is the same as for call(). The second is an indexed collection; starting from ECMAScript 5 it can be an array or an array-like object. apply() passes the elements of this collection to the called function as its arguments. The basic syntax is as follows:
function.apply(thisArg, [arg1, arg2, ...])
Example:
function test(a, b, c) {
    console.log(a + b + c)
}
test.apply(null, [1, 2, 3])
// 6
function test() {
    console.log(this.firstName + " " + this.lastName)
}
var data = {firstName: "John", lastName: "Doe"}
test.apply(data)
// John Doe
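To illustrate the array and array-like cases, the sketch below passes both to the same function and also uses the well-known Math.max.apply() idiom. The function joinAll() and the object arrayLike are invented for the example.

```javascript
// Joins however many arguments it receives.
function joinAll() {
    return Array.prototype.slice.call(arguments).join("-");
}

// A real array as the second parameter.
var fromArray = joinAll.apply(null, ["a", "b", "c"]);

// An array-like object: indexed properties plus a length.
var arrayLike = {0: "x", 1: "y", length: 2};
var fromArrayLike = joinAll.apply(null, arrayLike);

// Spreading an array over a function that expects separate arguments.
var largest = Math.max.apply(null, [3, 9, 4]);

console.log(fromArray);     // a-b-c
console.log(fromArrayLike); // x-y
console.log(largest);       // 9
```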
bind()
bind() accepts the same parameters as call(), except that bind() returns a function instead of calling it. The basic syntax is as follows:
function.bind(thisArg, arg1, arg2, ...)
Example:
function test(a, b, c) {
    console.log(a + b + c)
}
test.bind(null, 1, 2, 3)()
// 6
function test() {
    console.log(this.firstName + " " + this.lastName)
}
var data = {firstName: "John", lastName: "Doe"}
test.bind(data)()
// John Doe
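Two further properties of bind() are worth seeing in code: arguments can be fixed in advance, and the bound this cannot be replaced afterwards, not even with call(). A minimal sketch, with all names made up for illustration:

```javascript
function fullName(title) {
    return title + " " + this.firstName + " " + this.lastName;
}

var john = {firstName: "John", lastName: "Doe"};
var jane = {firstName: "Jane", lastName: "Roe"};

var johnName = fullName.bind(john);
var direct = johnName("Mr.");
// call() on a bound function cannot replace the bound this.
var rebound = johnName.call(jane, "Ms.");

function add(a, b) {
    return a + b;
}
// The first argument is fixed to 10; the rest is supplied at call time.
var addTen = add.bind(null, 10);
var sum = addTen(5);

console.log(direct);  // Mr. John Doe
console.log(rebound); // Ms. John Doe
console.log(sum);     // 15
```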
Understanding webpack
With the knowledge above, let's look again at the modular pattern, that is, the webpack-style code mentioned earlier:
!function (allModule) {
    function useModule(whichModule) {
        allModule[whichModule].call(null, "hello world!");
    }
    useModule(0)
}([
    function module0(param) {console.log("module0: " + param)},
    function module1(param) {console.log("module1: " + param)},
    function module2(param) {console.log("module2: " + param)},
]);
First, the whole snippet is an IIFE. The argument passed to it is an array of three functions, module0, module1 and module2, which can be regarded as three modules; the IIFE receives them in its parameter allModule. The IIFE also contains a function useModule(), which can be regarded as a module loader that decides which module to use. In the example, useModule(0) calls the first module: the loader uses call() to set this and pass the argument, and the selected module prints the output.
Rewrite webpack
The webpack-style modular code that often turns up when reverse engineering sites for crawlers is easy to rewrite. Take the following piece of encryption code as an example:
CryptoJS = require("crypto-js");
!function (func) {
    function acvs() {
        var kk = func[1].call(null, 1e3);
        var data = {
            r: "I LOVE PYTHON",
            e: kk,
            i: "62bs819idl00oac2",
            k: "0123456789abcdef"
        }
        return func[0].call(data);
    }
    console.log("Encrypted text:" + acvs())
    function odsc(account) {
        var cr = false;
        var regExp = /(^\d{7,8}$)|(^0\d{10,12}$)/;
        if (regExp.test(account)) {
            cr = true;
        }
        return cr;
    }
    function mkle(account) {
        var cr = false;
        var regExp = /^([a-zA-Z0-9_\.\-\+])+\@(([a-zA-Z0-9\-])+\.)+([a-zA-Z0-9]{2,4})+$/;
        if (regExp.test(account)) {
            cr = true;
        }
        return cr;
    }
}([
    function () {
        for (var n = "", t = 0; t < this.r.length; t++) {
            var o = this.e ^ this.r.charCodeAt(t);
            n += String.fromCharCode(o)
        }
        return encodeURIComponent(n)
    },
    function (x) {
        return Math.ceil(x * Math.random())
    },
    function (e) {
        var a = CryptoJS.MD5(this.k);
        var c = CryptoJS.enc.Utf8.parse(a);
        var d = CryptoJS.AES.encrypt(e, c, { iv: this.i });
        return d + ""
    },
    function (e) {
        var b = CryptoJS.MD5(this.k);
        var d = CryptoJS.enc.Utf8.parse(b);
        var a = CryptoJS.AES.decrypt(e, d, { iv: this.i }).toString(CryptoJS.enc.Utf8);
        return a
    }
]);
The key encryption entry point is acvs(). It calls only the first and second functions in the IIFE's argument array (func[0] and func[1]); all the other functions are decoys. The first function reads r and e from this, so they can be passed in directly as parameters instead. The final rewrite is as follows:
function a(r, e) {
    for (var n = "", t = 0; t < r.length; t++) {
        var o = e ^ r.charCodeAt(t);
        n += String.fromCharCode(o)
    }
    return encodeURIComponent(n)
}
function b(x) {
    return Math.ceil(x * Math.random())
}
function acvs() {
    var kk = b(1e3);
    var r = "I LOVE PYTHON";
    return a(r, kk);
}
console.log("Encrypted text:" + acvs())
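Because the key is random, the output changes on every run, which makes a rewrite hard to check by eye. One possible way to gain confidence, sketched below, is to feed the original module and the rewritten function the same fixed keys and compare the results. The name originalModule and the loop over keys are additions for this check only; they are not part of the site's code.

```javascript
// The first module from the IIFE's argument array, unchanged: reads r and e from this.
function originalModule() {
    for (var n = "", t = 0; t < this.r.length; t++) {
        var o = this.e ^ this.r.charCodeAt(t);
        n += String.fromCharCode(o)
    }
    return encodeURIComponent(n)
}

// The rewritten version: r and e arrive as ordinary parameters.
function a(r, e) {
    for (var n = "", t = 0; t < r.length; t++) {
        var o = e ^ r.charCodeAt(t);
        n += String.fromCharCode(o)
    }
    return encodeURIComponent(n)
}

var text = "I LOVE PYTHON";
var mismatches = 0;

// Math.ceil(1e3 * Math.random()) yields a key between 0 and 1000, so try them all.
for (var key = 0; key <= 1000; key++) {
    if (originalModule.call({r: text, e: key}) !== a(text, key)) {
        mismatches++;
    }
}

console.log("mismatches: " + mismatches); // mismatches: 0
```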
Summary
After reading this article you may think that webpack is nothing special. It does look simple here, but when analyzing real sites the code is often far less simple than the examples above. This article only aims to give you a basic understanding of how webpack-style modular code works. Brother K will walk you through more complex webpack cases in future practical articles, so stay tuned!