A handwritten JS engine to explain an assignment interview question

Here is an interview question:

let a = { n: 1};

a.x = a = { n: 2};

console.log(a.x);

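What is the output?

The output is undefined. The chained assignment is processed from left to right: the target a.x is resolved first, while a still refers to the original object { n: 1 }, and only then is a rebound to { n: 2 }. The new object therefore ends up in the x property of the original object, and the object that a now refers to has no x property.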

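Add a variable preA that references a before the assignment, and you can see that the new object { n: 2 } ends up in preA.x, while a itself now refers to that new object, which has no x property.

This comes down to operator precedence and evaluation order. Just as arithmetic operators are applied according to their precedence, the member access operator . has higher precedence than =, and operands are evaluated from left to right, so a.x is bound to the original object before a is reassigned.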
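The behaviour described above can be checked directly in plain JavaScript. This is only a small sketch that runs the question's code with the extra preA reference; nothing here depends on the engine built later in the article.

```javascript
// Keep a second reference to the original object before the chained assignment.
let a = { n: 1 };
let preA = a;

// The target a.x is resolved while `a` still refers to the original object.
a.x = a = { n: 2 };

console.log(a.x);          // undefined: the new object has no x property
console.log(preA.x === a); // true: the original object's x holds the new object
console.log(preA.n, a.n);  // 1 2
```

The second line of output is the interesting one: the property was written to the object that a used to refer to, not to the object it refers to now.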

This article is not about operator precedence, though. Instead, we will implement a JS engine that interprets and executes this code.

How do we implement a JS engine?

Composition of a JS engine

A JS engine consists of four parts: Parser, Interpreter, JIT Compiler and Garbage Collector.

The Parser turns the source code into an AST, an object tree that a program can process. The Interpreter then walks the tree and interprets each node recursively, which is what interpreting and executing code means.

However, interpretation is relatively slow. To speed things up, a JIT compiler is introduced, which compiles frequently executed (hot) code into machine code for direct execution.

Memory is also limited, so memory that is no longer in use has to be reclaimed periodically. This is garbage collection, which is the job of the GC.

If we implement a simple JS engine ourselves, we can omit the JIT compiler and the GC: they optimize time and space respectively and are not required to run code.

That is, we only implement the Parser and the Interpreter. This is the structure of the JS engine we want to implement.

Analysis of the implementation of a simple JS engine

The Parser can be any JS parser. We can use @babel/parser directly.

The AST it generates can be visualized with astexplorer.net.

How do we interpret and execute it?

Interpreting the AST means recursively interpreting each AST node. How is a specific AST node interpreted?

For example, the object expression { n: 1 } corresponds to an ObjectExpression node. Its properties array holds ObjectProperty child nodes, each of which has a key and a value.

Naturally, interpreting an ObjectExpression node means taking the data out of the AST, constructing an object from it, and returning that object.

As another example, let a = { n: 1 } is a declaration statement, which corresponds to a VariableDeclaration node. It has one or more VariableDeclarator child nodes, because a single declaration statement can declare several variables, such as let a = 1, b = 1;. Each VariableDeclarator has an id part and an init part.

Here the init part is an ObjectExpression, which is interpreted by constructing and returning an object. Interpreting the whole declaration then means putting a variable into the scope: its name comes from the id node, and its value is the result of interpreting the init node.

The id node is an Identifier, which here is a.
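To make the node structure concrete, here is a hand-written, trimmed-down version of the AST for let a = { n: 1 };, written as a plain object literal. It is a sketch of the shape @babel/parser produces, with location fields and other metadata left out; the variable names declarationAst and declarator are only for illustration.

```javascript
// Hand-written, trimmed-down AST for: let a = { n: 1 };
// Location fields (start, end, loc) and other metadata are omitted.
const declarationAst = {
    type: 'VariableDeclaration',
    kind: 'let',
    declarations: [
        {
            type: 'VariableDeclarator',
            // id: the name being declared
            id: { type: 'Identifier', name: 'a' },
            // init: the expression that produces the initial value
            init: {
                type: 'ObjectExpression',
                properties: [
                    {
                        type: 'ObjectProperty',
                        key: { type: 'Identifier', name: 'n' },
                        value: { type: 'NumericLiteral', value: 1 }
                    }
                ]
            }
        }
    ]
};

const declarator = declarationAst.declarations[0];
console.log(declarator.id.name);   // a
console.log(declarator.init.type); // ObjectExpression
```

Every node carries a type field, which is what the interpreter dispatches on.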

This is how the object expression ObjectExpression and the declaration statement VariableDeclaration are interpreted and executed. It's not difficult.

The other AST nodes are handled in the same way. Interpreting each node recursively is the principle behind interpreting and executing JS code.

Now that we know what to do, let's write the code:

Interpretation and execution of declaration statements

We first implement the interpretation and execution of declaration statements, and then implement the interpretation and execution of assignment statements:

For the parser we use @babel/parser: its parse method converts the source code into an AST, and then we interpret the AST:

const parser = require('@babel/parser');

// Note: inside this file, this function shadows the built-in eval.
function eval(code) {
    const ast = parser.parse(code);
    evaluate(ast.program);
}

const scope = new Map();

function evaluate(node) {
    // Placeholder: the interpreter that dispatches on node.type is implemented later.
}

const code = `
let a = { dong: 111};
let b = { guang: 222};
let c = b;
`
eval(code);

console.log(scope);

We declare a Map object as the global scope.

The next step is to implement the interpreter, that is, the evaluate function.

As analyzed earlier, interpretation means processing each AST node recursively, and each node type is handled differently:

const astInterpreters = {
    Program(node) {
        node.body.forEach(item => {
            evaluate(item);
        })
    },
    VariableDeclaration(node) {
    },
    VariableDeclarator(node) {
    },
    ObjectExpression(node) {
    },
    Identifier(node) {
        return node.name;
    },
    NumericLiteral(node) {
        return node.value;
    }
}

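function evaluate(node) {
    const interpret = astInterpreters[node.type];
    if (!interpret) {
        // Report unknown node types explicitly, without masking errors thrown by handlers.
        throw new Error('Node type not supported: ' + node.type);
    }
    return interpret(node);
}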

We have implemented a few simple node interpretations. For the Program root node, interpreting it means interpreting each statement in its body. An Identifier evaluates to its name property, and a NumericLiteral evaluates to its value property.

Next is the object expression ObjectExpression, which is interpreted by constructing and returning an object:

ObjectExpression(node) {
    const obj = {};
    node.properties.forEach(prop => {
        const key = evaluate(prop.key);
        const value = evaluate(prop.value);
        obj[key] = value;
    });
    return obj;
}

That is, for each node in properties, interpret its key and value, set the result on the object, and finally return the object.
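As a quick check of this handler, the sketch below runs it on a hand-built node instead of parser output, so it needs no dependencies. The node is a trimmed-down imitation of what @babel/parser would produce for { dong: 111, inner: { guang: 222 } }, and the prop helper is hypothetical; the nested object is there to show that the recursion through evaluate handles it without extra code.

```javascript
const astInterpreters = {
    ObjectExpression(node) {
        const obj = {};
        node.properties.forEach(prop => {
            // Both key and value are themselves AST nodes, so evaluate them recursively.
            obj[evaluate(prop.key)] = evaluate(prop.value);
        });
        return obj;
    },
    Identifier(node) {
        return node.name;
    },
    NumericLiteral(node) {
        return node.value;
    }
};

function evaluate(node) {
    return astInterpreters[node.type](node);
}

// Hypothetical helper that builds an ObjectProperty node.
const prop = (name, value) => ({
    type: 'ObjectProperty',
    key: { type: 'Identifier', name },
    value
});

// Trimmed-down AST for: { dong: 111, inner: { guang: 222 } }
const objectNode = {
    type: 'ObjectExpression',
    properties: [
        prop('dong', { type: 'NumericLiteral', value: 111 }),
        prop('inner', {
            type: 'ObjectExpression',
            properties: [prop('guang', { type: 'NumericLiteral', value: 222 })]
        })
    ]
};

const result = evaluate(objectNode);
console.log(result); // { dong: 111, inner: { guang: 222 } }
```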

Interpreting a declaration statement means setting the variable in the scope (the Map we declared):

VariableDeclaration(node) {
    node.declarations.forEach((item) => {
        evaluate(item);
    });
},
VariableDeclarator(node) {
    const declareName = evaluate(node.id);
    // Use has() so that variables holding falsy values (0, undefined) still count as declared.
    if (scope.has(declareName)) {
        throw new Error('duplicate declare variable: ' + declareName);
    } else {
        const valueNode = node.init;
        let value;
        if (valueNode.type === 'Identifier') {
            // An identifier on the right-hand side is a reference: look up its value.
            value = scope.get(valueNode.name);
        } else {
            value = evaluate(valueNode);
        }
        scope.set(declareName, value);
    }
},

VariableDeclaration is a declaration statement. Because it may contain more than one declarator, each one is executed in a loop.

For each VariableDeclarator, we set the variable name and its value in the scope.

The variable name is the result of evaluating node.id. If that name has already been declared, an error is thrown, because a variable can only be declared once.

Otherwise, we evaluate node.init and store the result in the scope, that is, scope.set(declareName, value).

The value needs some care, though. If init is an Identifier, it is actually a reference to another variable, such as a, so we first look up that variable's value in the scope.
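Two details of this logic are easy to get wrong, so here is a small sketch that isolates them using only a Map. The declare helper is hypothetical and stands in for the VariableDeclarator handler. It tests for an existing declaration with scope.has, since a check based on scope.get would treat a variable holding a falsy value such as 0 as undeclared.

```javascript
const scope = new Map();

// Hypothetical helper mirroring the declaration logic described above.
function declare(name, value) {
    if (scope.has(name)) {
        throw new Error('duplicate declare variable: ' + name);
    }
    scope.set(name, value);
}

declare('count', 0);          // a falsy value is still a declaration
declare('b', { guang: 222 });
declare('c', scope.get('b')); // like `let c = b`: look the identifier up first

// Both names refer to one and the same object, not to two copies.
console.log(scope.get('c') === scope.get('b')); // true

let message = '';
try {
    declare('count', 1);      // redeclaring must fail even though count is 0
} catch (e) {
    message = e.message;
}
console.log(message); // duplicate declare variable: count
```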

With the interpretation logic for these nodes written, we can interpret this code:

let a = { dong: 111};
let b = { guang: 222};
let c = b;

Three variables a, b and c are declared. The initial values of a and b are object literals, and c refers to the same object as b.

After execution, we print the scope, which now holds a, b and c.

Execution succeeded! We have implemented the simplest possible JS engine!

Of course, declarations alone are not enough. Next, we implement the interpretation of assignment statements:

Interpretation and execution of assignment statements

Interpreting an assignment statement means interpreting the AssignmentExpression node. We can use astexplorer.net to look at its structure.

Why is it wrapped in an ExpressionStatement node?

Because an expression cannot be executed on its own: the statement is the basic unit of execution, so the expression is wrapped in an ExpressionStatement.

AssignmentExpression has left and right properties, which are the nodes on the left and right of = respectively.

What if right is itself an AssignmentExpression?

Then we keep following right until we reach a node that is not an AssignmentExpression. That node is the value to be assigned.

All the collected left nodes are then assigned this value, from left to right.

Therefore, the AssignmentExpression node is interpreted as follows:

To walk to the right, we declare curNode to represent the current node and follow right in a while loop, collecting every left along the way into an array:

let curNode = node;
const targetArr = [curNode.left];
while(curNode.right.type === 'AssignmentExpression') {
    curNode = curNode.right;
    targetArr.push(curNode.left);
}

The last right is the AST of the value being assigned. Interpreting it yields the value.

const value = evaluate(curNode.right);
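The sketch below runs the same loop on a hand-built AST for a.x = a = { n: 2 }, to show which targets it collects and in what order. The AST is a trimmed-down imitation of @babel/parser output, and collectTargets is a hypothetical helper that only wraps the loop shown above.

```javascript
// Hand-written, trimmed-down AST for the expression: a.x = a = { n: 2 }
// Because = is right-associative, the second assignment is nested in `right`.
const assignmentAst = {
    type: 'AssignmentExpression',
    operator: '=',
    left: {
        type: 'MemberExpression',
        object: { type: 'Identifier', name: 'a' },
        property: { type: 'Identifier', name: 'x' }
    },
    right: {
        type: 'AssignmentExpression',
        operator: '=',
        left: { type: 'Identifier', name: 'a' },
        right: {
            type: 'ObjectExpression',
            properties: [
                {
                    type: 'ObjectProperty',
                    key: { type: 'Identifier', name: 'n' },
                    value: { type: 'NumericLiteral', value: 2 }
                }
            ]
        }
    }
};

// Hypothetical helper: walk down the `right` chain and collect every `left`.
function collectTargets(node) {
    let curNode = node;
    const targetArr = [curNode.left];
    while (curNode.right.type === 'AssignmentExpression') {
        curNode = curNode.right;
        targetArr.push(curNode.left);
    }
    return { targetArr, valueNode: curNode.right };
}

const { targetArr, valueNode } = collectTargets(assignmentAst);
console.log(targetArr.map(target => target.type)); // [ 'MemberExpression', 'Identifier' ]
console.log(valueNode.type);                       // ObjectExpression
```

The targets come out in source order, a.x first and a second, which is the order in which they are assigned afterwards.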

Then assign value to every target in targetArr, from left to right:

Here we need to distinguish between assigning to a and assigning to a.x:

If the target is a, that is, an Identifier, we just set it in the scope: scope.set(varName, value).

If the target is a.x, that is, a MemberExpression, we first get the object from the scope with scope.get(objName), and then set the property on it.

That is:

targetArr.forEach(target => {
    if (target.type === 'Identifier'){
        const varName = evaluate(target);
        scope.set(varName, value);
    } else if (target.type === 'MemberExpression') {
        const objName = evaluate(target.object);
        const obj = scope.get(objName);

        const propName = evaluate(target.property);
        obj[propName] = value;
    }  
})

The complete code for the interpretation and execution of the assignment statement is as follows:

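ExpressionStatement(node) {
    // An expression statement simply evaluates the expression it wraps.
    return evaluate(node.expression);
},
AssignmentExpression(node) {
    // Follow the chain of nested assignments and collect every target.
    let curNode = node;
    const targetArr = [curNode.left];
    while (curNode.right.type === 'AssignmentExpression') {
        curNode = curNode.right;
        targetArr.push(curNode.left);
    }
    // The innermost right-hand side is the value being assigned.
    const value = evaluate(curNode.right);

    targetArr.forEach(target => {
        if (target.type === 'Identifier') {
            const varName = evaluate(target);
            scope.set(varName, value);
        } else if (target.type === 'MemberExpression') {
            const objName = evaluate(target.object);
            const obj = scope.get(objName);

            const propName = evaluate(target.property);
            obj[propName] = value;
        }
    });
    return value;
}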

With declaration and assignment statements both implemented, we have reached our goal. Let's execute the opening code:

let a = { n: 1};
let preA = a;
a.x = a = { n: 2};

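Looking at the values in the scope, a is { n: 2 }, which has no x property, while preA is the original object { n: 1 }, whose x property now refers to the new object.

This makes it clear why a.x is undefined.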
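For readers who want to reproduce this without installing a parser, the following sketch puts the handlers from this article together and runs them on a hand-built AST for the three lines above. The AST imitates the shape of @babel/parser output in trimmed-down form. The helpers id, objectWithN, declare and assign are hypothetical, and the ExpressionStatement handler and the explicit check in evaluate are assumptions added so that the sketch runs on its own.

```javascript
const scope = new Map();

const astInterpreters = {
    Program(node) { node.body.forEach(item => evaluate(item)); },
    VariableDeclaration(node) { node.declarations.forEach(item => evaluate(item)); },
    VariableDeclarator(node) {
        const name = evaluate(node.id);
        if (scope.has(name)) throw new Error('duplicate declare variable: ' + name);
        const init = node.init;
        // An identifier initializer is a reference, so look up its current value.
        scope.set(name, init.type === 'Identifier' ? scope.get(init.name) : evaluate(init));
    },
    ExpressionStatement(node) { return evaluate(node.expression); },
    AssignmentExpression(node) {
        let curNode = node;
        const targetArr = [curNode.left];
        while (curNode.right.type === 'AssignmentExpression') {
            curNode = curNode.right;
            targetArr.push(curNode.left);
        }
        const value = evaluate(curNode.right);
        targetArr.forEach(target => {
            if (target.type === 'Identifier') {
                scope.set(evaluate(target), value);
            } else if (target.type === 'MemberExpression') {
                const obj = scope.get(evaluate(target.object));
                obj[evaluate(target.property)] = value;
            }
        });
        return value;
    },
    ObjectExpression(node) {
        const obj = {};
        node.properties.forEach(prop => { obj[evaluate(prop.key)] = evaluate(prop.value); });
        return obj;
    },
    Identifier(node) { return node.name; },
    NumericLiteral(node) { return node.value; }
};

function evaluate(node) {
    const interpret = astInterpreters[node.type];
    if (!interpret) throw new Error('Node type not supported: ' + node.type);
    return interpret(node);
}

// Hypothetical helpers for building trimmed-down AST nodes by hand.
const id = name => ({ type: 'Identifier', name });
const objectWithN = n => ({
    type: 'ObjectExpression',
    properties: [{ type: 'ObjectProperty', key: id('n'), value: { type: 'NumericLiteral', value: n } }]
});
const declare = (name, init) => ({
    type: 'VariableDeclaration',
    kind: 'let',
    declarations: [{ type: 'VariableDeclarator', id: id(name), init }]
});
const assign = (left, right) => ({ type: 'AssignmentExpression', operator: '=', left, right });

// let a = { n: 1 }; let preA = a; a.x = a = { n: 2 };
const program = {
    type: 'Program',
    body: [
        declare('a', objectWithN(1)),
        declare('preA', id('a')),
        {
            type: 'ExpressionStatement',
            expression: assign(
                { type: 'MemberExpression', object: id('a'), property: id('x') },
                assign(id('a'), objectWithN(2))
            )
        }
    ]
};

evaluate(program);
const a = scope.get('a');
const preA = scope.get('preA');
console.log(a.x);          // undefined
console.log(preA.x === a); // true
```

Note that this simplified engine relies on the order of the targets: a.x is handled before a is rebound, so the property lands on the original object. It is not a general model of how a full engine resolves assignment targets.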

All the code is uploaded to GitHub: https://github.com/QuarkGluonPlasma/babel-article-code

Summary

We worked through an interview question about an assignment statement, which tests how a chained assignment is evaluated from left to right.

But just knowing how the assignment operator is evaluated is not enough, so we wrote a JS engine to execute it.

A JS engine consists of a Parser, an Interpreter, a JIT Compiler and a Garbage Collector. The JIT Compiler and the GC optimize time and space respectively and are not strictly necessary, so we implemented a JS engine with only a Parser and an Interpreter.

Any JS parser will do for the Parser, and we use @babel/parser. The interpreter works by interpreting each node recursively. We implemented the interpretation and execution of declaration statements and assignment statements.

Finally, we get the same result as at the beginning, and we know clearly how the assignment statement is interpreted and executed.

(Of course, because only these few lines of code need to be interpreted, the interpreter is far from complete. For a more complete interpreter, see the JS interpreter case in the booklet Babel plug-in customs clearance secrets.)

Added by srikanthiv on Thu, 03 Mar 2022 11:52:44 +0200
