Go compilation principle series 4 (syntax analysis)

preface

In the last article, I shared how the Go compiler parses source files into tokens. This article mainly shares how the syntax parsing stage performs syntax parsing according to different tokens. In this article, you can learn the following:

  1. Go parsing overview
  2. Go syntax parsing detailed process
  3. The example shows the complete process of syntax parsing

💡 Tips: the article will involve grammar and production. If you don't know much about it, please take a look at the previous Lexical analysis & the basis of grammatical analysis This article

The article is relatively long, but there are many codes, mainly for easy understanding. I believe you will gain something after reading it

Go parsing overview

In order to facilitate the later understanding, I provide a source file here. You can bring in the following contents according to the contents of the source file:

package main

import (
    "fmt"
    "go/token"
)

type aType string
const A = 666
var B = 888

func main() {
    fmt.Println("Test Parser")
    token.NewFileSet()
}

entrance

Introduced in the last article lexical analysis The compiler entry of Go is described in detail. The parser is initialized at the compiler entry, and the lexical analyzer is initialized during the initialization of the parser. Therefore, the lexical analyzer is embedded in the parser. We can find it in Src / CMD / compile / internal / syntax / parser The structure of the parser in Go is as follows:

type parser struct {
    file  *PosBase //Record the information of the open file (such as file name, row and column information)
    errh  ErrorHandler //Error callback
    mode  Mode //Parsing mode
    pragh PragmaHandler
    scanner //Lexical analyzer

    base   *PosBase // current position base
    first  error    // first error encountered
    errcnt int      // number of errors encountered
    pragma Pragma   // pragmas

    fnest  int    // function nesting level (for error handling)
    xnest  int    // Expression nesting level (for compact ambiguity resolution)
    indent []byte // tracing support
}

Because in Last The article has shared the location of the entry file and what it does in detail (src/cmd/compile/internal/syntax/syntax.go → Parse(...)), After the initialization of the parser and lexical analyzer is completed, we will see that it calls the fileOrNil() method of the parser, which is the core method of parsing. Next, we will introduce what this core method does

Go syntax parsing structure

This part will involve the parsing of each declaration type and the structure of each declaration type, as well as the relationship between the nodes of the syntax tree generated by the syntax parser. Some places are really difficult to understand. You can read the text roughly first, and then combined with the grammar analysis diagram below, it may be easier to understand

Grammar rules in Go parsing

I'll introduce this part as a whole first, and then go to the details

Enter the fileOrNil() method, and you will see such a line of comments in the comments

SourceFile = PackageClause ";" { ImportDecl ";" } { TopLevelDecl ";" } .

It is the grammar rule of Go parsing source files, which is in the second article of the series Fundamentals of lexical and grammatical analysis It is mentioned in part. After initializing the parser and lexical analyzer, the Go compiler will call this method. This method will continuously obtain tokens through the next() method provided by the lexical word splitter. The syntax analysis will be carried out according to the above grammar rules

You may not know the production formulas of PackageClause, ImportDecl and TopLevelDecl. You can find these three production formulas directly in this file (for production formulas, see Fundamentals of lexical and grammatical analysis (some are introduced)

SourceFile = PackageClause ";" { ImportDecl ";" } { TopLevelDecl ";" } .

PackageClause = "package" PackageName . 
PackageName = identifier . 

ImportDecl = "import" ( ImportSpec | "(" { ImportSpec ";" } ")" ) . 
ImportSpec = [ "." | PackageName ] ImportPath . 
ImportPath = string_lit .

TopLevelDecl = Declaration | FunctionDecl | MethodDecl . 
Declaration = ConstDecl | TypeDecl | VarDecl . 

ConstDecl = "const" ( ConstSpec | "(" { ConstSpec ";" } ")" ) . 
ConstSpec = IdentifierList [ [ Type ] "=" ExpressionList ] . 

TypeDecl = "type" ( TypeSpec | "(" { TypeSpec ";" } ")" ) . 
TypeSpec = AliasDecl | TypeDef . 
AliasDecl = identifier "=" Type . 
TypeDef = identifier Type . 

VarDecl = "var" ( VarSpec | "(" { VarSpec ";" } ")" ) . 
VarSpec = IdentifierList ( Type [ "=" ExpressionList ] | "=" ExpressionList ) .

fileOrNil() simply means parsing the source File according to the grammar SourceFile, and finally returns a syntax tree (File structure, which will be described below). As we know, the compiler will parse each File into a syntax tree

The following is a brief introduction to the meaning of several grammars in Go grammar analysis (if you read the previous one) Fundamentals of lexical and grammatical analysis This article should be easy to understand the meaning of Go (these Grammars)

SourceFile = PackageClause ";" { ImportDecl ";" } { TopLevelDecl ";" } .

We can see SourceFile By PackageClause,ImportDecl,TopLevelDecl These three non terminal symbols form. It means:
Each source file is mainly composed of package Declaration, import declaration, and top-level declaration (where ImportDecl and TopLevelDecl Is optional and optional, which is the meaning of square brackets). In other words, every source file should conform to this grammar rule

The first is the package declaration: PackageClause
PackageClause = "package" PackageName . 
PackageName = identifier . 
PackageClause Is a terminator package And a non terminator PackageName Constitute, and PackageName Consisting of an identifier

Therefore, when scanning the source file, the first thing you should get is package of Token,Then an identifier Token. Finished parsing package After the declaration, the import declaration should be followed

Then comes the import declaration: ImportDecl
ImportDecl = "import" ( ImportSpec | "(" { ImportSpec ";" } ")" ) . 
ImportSpec = [ "." | PackageName ] ImportPath . 
ImportPath = string_lit .

After understanding the meaning of the above grammar, let's look at how fileNil() parses the grammar according to the above grammar. But before that, you need to know what the node structure of the syntax tree generated by the fileOrNil() method is

Structure of each node of syntax tree

We know that the fileOrNil() method will generate a syntax tree according to the grammar rules of SourceFile, and the structure of each syntax tree is such a structure: (src/cmd/compile/internal/syntax/nodes.go → File)

type File struct {
    Pragma   Pragma
    PkgName  *Name
    DeclList []Decl
    Lines    uint
    node
}

It mainly contains the package name (PkgName) of the source file and all declarations (DeclList) in the source file. It should be noted that it will also parse import into DeclList as a declaration

fileOrNil() will parse all declarations (such as var, type, const) in the source file into DeclList according to the structure of each declaration (each declaration defines corresponding structures to store declaration information, and these structures are the child nodes of the syntax tree). In the grammar above, we can see the constant, type and variable declaration grammar, as well as the grammar of functions and methods, which can be found in Src / CMD / compile / internal / syntax / parser Found in go

FunctionDecl = "func" FunctionName ( Function | Signature ) .
FunctionName = identifier .
Function     = Signature FunctionBody .

MethodDecl   = "func" Receiver MethodName ( Function | Signature ) .
Receiver     = Parameters .

The root node structure of this syntax tree is the File structure, and its child node structure is:

ImportDecl struct {
        Group        *Group // nil means not part of a group
        Pragma       Pragma
        LocalPkgName *Name // including "."; nil means no rename present
        Path         *BasicLit
        decl
    }

ConstDecl struct {
        Group    *Group // nil means not part of a group
        Pragma   Pragma
        NameList []*Name
        Type     Expr // nil means no type
        Values   Expr // nil means no values
        decl
    }

TypeDecl struct {
        Group  *Group // nil means not part of a group
        Pragma Pragma
        Name   *Name
        Alias  bool
        Type   Expr
        decl
    }

VarDecl struct {
        Group    *Group // nil means not part of a group
        Pragma   Pragma
        NameList []*Name
        Type     Expr // nil means no type
        Values   Expr // nil means no values
        decl
    }

FuncDecl struct {
        Pragma Pragma
        Recv   *Field // nil means regular function
        Name   *Name
        Type   *FuncType
        Body   *BlockStmt // nil means no body (forward declaration)
        decl
    }

The structure of these nodes is defined in Src / CMD / compile / internal / syntax / nodes Go. That is, in the process of parsing, if it is parsed to meet the ImportDecl grammar rules, it will create nodes with corresponding structures to save relevant information; In case of meeting the var type declaration grammar, create the structure related to the var type declaration (VarDecl) to save the declaration information

If the function is parsed, it is a little more complex. From the structure of the function node, you can see that it includes the receiver, function name, type and function body. The most complex part is the function body, which is a BlockStmt structure:

BlockStmt struct {
        List   []Stmt //Stmt is an interface
        Rbrace Pos
        stmt
    }

BlockStmt is composed of a series of declarations and expressions. You can find it in Src / CMD / compile / internal / syntax / nodes You can see many expressions and declared structures in go (these structures are also the node structures under the function declaration)

// ----------------------------------------------------------------------------
// Statements
......
SendStmt struct {
        Chan, Value Expr // Chan <- Value
        simpleStmt
    }

    DeclStmt struct {
        DeclList []Decl
        stmt
    }

    AssignStmt struct {
        Op       Operator // 0 means no operation
        Lhs, Rhs Expr     // Rhs == ImplicitOne means Lhs++ (Op == Add) or Lhs-- (Op == Sub)
        simpleStmt
    }
......
ReturnStmt struct {
        Results Expr // nil means no explicit return values
        stmt
    }

    IfStmt struct {
        Init SimpleStmt
        Cond Expr
        Then *BlockStmt
        Else Stmt // either nil, *IfStmt, or *BlockStmt
        stmt
    }

    ForStmt struct {
        Init SimpleStmt // incl. *RangeClause
        Cond Expr
        Post SimpleStmt
        Body *BlockStmt
        stmt
    }
......

// ----------------------------------------------------------------------------
// Expressions
......
// [Len]Elem
    ArrayType struct {
        // TODO(gri) consider using Name{"..."} instead of nil (permits attaching of comments)
        Len  Expr // nil means Len is ...
        Elem Expr
        expr
    }

    // []Elem
    SliceType struct {
        Elem Expr
        expr
    }

    // ...Elem
    DotsType struct {
        Elem Expr
        expr
    }

    // struct { FieldList[0] TagList[0]; FieldList[1] TagList[1]; ... }
    StructType struct {
        FieldList []*Field
        TagList   []*BasicLit // i >= len(TagList) || TagList[i] == nil means no tag for field i
        expr
    }
......
// X[Index]
    IndexExpr struct {
        X     Expr
        Index Expr
        expr
    }

    // X[Index[0] : Index[1] : Index[2]]
    SliceExpr struct {
        X     Expr
        Index [3]Expr
        // Full indicates whether this is a simple or full slice expression.
        // In a valid AST, this is equivalent to Index[2] != nil.
        // TODO(mdempsky): This is only needed to report the "3-index
        // slice of string" error when Index[2] is missing.
        Full bool
        expr
    }

    // X.(Type)
    AssertExpr struct {
        X    Expr
        Type Expr
        expr
    }
......

Does it look familiar? There are if, for and return. Stmt in BlockStmt is an interface type, which means that various expression types or declaration type structures above can implement stmt interfaces

After knowing the relationship between the nodes in the syntax tree and the possible structures of these nodes, let's see how fileOrNil parses the nodes of the syntax tree step by step

Source code implementation of fileOrNil()

In this part, I will not go deep into each function to see how it is parsed, but introduce how it parses the token in the source file step by step from the general outline, because this part is mainly understood from the whole first. How to parse import, var, const and func will be described in detail in the next section

You can see that the code implementation of fileOrNil() mainly includes the following parts:

// SourceFile = PackageClause ";" { ImportDecl ";" } { TopLevelDecl ";" } .
func (p *parser) fileOrNil() *File {
    ......

  //1. Create a File structure
    f := new(File)
    f.pos = p.pos()

  // 2. First parse the package definition at the beginning of the file
    // PackageClause
    if !p.got(_Package) { //Check whether the package is defined in the first line
        p.syntaxError("package statement must be first")
        return nil
    }
    
    // 3. After parsing the package, parse the import declaration (each import is a declaration statement in the view of the parser)
    // { ImportDecl ";" }
    for p.got(_Import) {
        f.DeclList = p.appendGroup(f.DeclList, p.importDecl)
        p.want(_Semi)
    }

    // 4. switch according to the obtained token, select the corresponding branch to parse the corresponding type of statement
    // { TopLevelDecl ";" }
    for p.tok != _EOF {
        switch p.tok {
        case _Const:
            p.next() // Get the next token
            f.DeclList = p.appendGroup(f.DeclList, p.constDecl)

        ......
    }
    // p.tok == _EOF

    p.clearPragma()
    f.Lines = p.line

    return f
}

There are two important methods in fileOrNil(), which are the key to understanding what fileOrNil() is doing:

  • got
  • appendGroup

The first is the got function, whose parameter is a token, which is used to judge whether the token obtained from the lexical analyzer is the token passed in from the parameter

Then there is the appendGroup function, which has two parameters. The first is DeclList (described earlier when introducing the members of the File structure, which is used to store all declarations in the source File and is a slice type); The second parameter is a function, which is the analysis function of each type of declaration statement (for example, if I currently resolve to import declaration statement, I will pass the method of resolving import as the second parameter to appendGroup)

When parsing the import declaration statement, the following is the code:

for p.got(_Import) {
        f.DeclList = p.appendGroup(f.DeclList, p.importDecl)
        p.want(_Semi)
}

The function of appendGroup is to find out the definition of batch, such as the following

//Batch declaration of import
import (
    "fmt"
    "io"
    "strconv"
    "strings"
)

//Batch declaration of var
var (
    x int
    y int
)

For the declaration statement structure with batch declaration, it will have a Group field to indicate that these variables belong to the same Group, such as the declaration structure of import and the declaration structure of var

ImportDecl struct {
        Group        *Group // nil means not part of a group
        Pragma       Pragma
        LocalPkgName *Name // including "."; nil means no rename present
        Path         *BasicLit
        decl
}

VarDecl struct {
        Group    *Group // nil means not part of a group
        Pragma   Pragma
        NameList []*Name
        Type     Expr // nil means no type
        Values   Expr // nil means no values
        decl
}

In the appendGroup method, the parsing method of the corresponding declared type will be called. Like fileOrNil(), it will be parsed according to the grammar of the corresponding type declaration. For example, the method importDecl() that parses the import declaration

for p.got(_Import) {
        f.DeclList = p.appendGroup(f.DeclList, p.importDecl)
        p.want(_Semi)
}
......

// ImportSpec = [ "." | PackageName ] ImportPath .
// ImportPath = string_lit .
func (p *parser) importDecl(group *Group) Decl {
    if trace {
        defer p.trace("importDecl")()
    }

    d := new(ImportDecl)
    d.pos = p.pos()
    d.Group = group
    d.Pragma = p.takePragma()

    switch p.tok {
    case _Name:
        d.LocalPkgName = p.name()
    case _Dot:
        d.LocalPkgName = p.newName(".")
        p.next()
    }
    d.Path = p.oliteral()
    if d.Path == nil {
        p.syntaxError("missing import path")
        p.advance(_Semi, _Rparen)
        return nil
    }

    return d
}

We can see that it also creates the structure of the corresponding declaration (here is the import declaration), records the declaration information, and parses the subsequent contents according to the grammar of the declaration

Others include declaration statements such as const, type, var and func, which are matched and parsed by switch. They also have corresponding parsing methods (in fact, the code implemented according to their own grammar rules). I don't list them here. You can use Src / CMD / compile / internal / syntax / parser View in go

As we mentioned earlier, the parser will eventually use a different structure to build each node of the syntax tree. The root node is Src / CMD / compile / internal / syntax / nodes go → File. Its structure has been introduced earlier, mainly including package names and all declaration types. These different types of declarations are the child nodes of the syntax tree

It may be a little difficult to understand through the text description above. The following shows what the whole syntax parsing process is like through a diagram

Graphical parsing process

Take the source code provided at the beginning of the article as an example to show the process of syntax parsing. stay Go lexical analysis It is mentioned in this article that the compilation entry of go is from Src / CMD / compile / internal / GC / noder Go → parseFiles

Here, I believe you have an overall understanding of the grammar analysis part of Go. However, the above does not show how various declaration statements are parsed down, which requires an in-depth look at how the parsing method of each declaration statement is implemented

Go syntax parsing detailed process

For the parsing of various declarations and expressions in Go syntax analysis, you can find them in Src / CMD / compile / internal / syntax / parser Find the corresponding method in Go

Variable declaration & import declaration resolution

The variable declaration is used to call the relevant resolution methods in Src / CMD / compile / internal / syntax / parser Go → fileOrNil() method

Import declaration resolution

In Src / CMD / compile / internal / syntax / parser You can see the following code in go → fileOrNil()

for p.got(_Import) {
        f.DeclList = p.appendGroup(f.DeclList, p.importDecl)
        p.want(_Semi)
    }

In this appendGroup, the importDecl() method will eventually be called (it can be found from the above code that the parsing method of the import declaration will not be executed until the token of the import is matched)

First of all, import can be used in the following ways:

import "a" //By default, a is the package name
import aAlias "x/a" // Alias the x/a package aAlias
import . "c" // Import the public symbols of dependent packages directly into the namespace of the current file
import _ "d" // Only importing dependent packages triggers the initialization of their packages, but does not import any symbols into the current file namespace

For space reasons, I won't stick out the source code of the importDecl() method. I sort out what it does here:

  1. Create a structure of ImportDeal (which will eventually be append ed to File.DeclList)
  2. Initialize some data in the structure, such as the location information of the parsed token, group, etc
  3. Then match the next token to see if it is_ Token type (identifier) of Name, or_ Dot(.)
  4. If the obtained token is_ Name, the package name is obtained. If yes, the package name is obtained_ Dot(.), Create a new name
  5. Then there is the matching package path, which is mainly implemented by the oliteral() method
  6. Returns the ImportDeal structure

What is worth mentioning here is the oliteral() method, which will get the next token to see if it is a token of basic face value type, that is_ Literal

💡 Tips: the basic face value has only integer, floating point number, complex number, Rune and string types

If it is a base face value type, it will create a structure node of the base face value type Basiclit, and initialize some information about it

BasicLit struct {
        Value string   //value
        Kind  LitKind  //Base face value of that type, range (IntLit, FloatLit, imagelit, RuneLit, StringLit)
        Bad   bool // true means the literal Value has syntax errors
        expr
}

It is not difficult to see that when the basic face value type is parsed, it is already indecomposable, which is the terminator in grammar. In the syntax tree, these terminators are leaf nodes

Relevant methods are provided in the go standard library to test syntax parsing. Here, I test the results of syntax parsing during import through the interface provided by go/parser:

💡 Tips: same as lexical analysis (you can have a look Go lexical analyzer In the standard library testing part of this article, the implementation of syntax analysis in the standard library is different from that in the go compiler. It is mainly the design of structure (for example, if the structure of the node changes, you can see it yourself. It is relatively simple. If you understand the implementation of syntax analysis in the compiler, you can also understand it in the standard library), the implementation idea is the same

package main

import (
    "fmt"
    "go/parser"
    "go/token"
)
const src = `
package test
import "a"
import aAlias "x/a"
import . "c"
import _ "d"
`

func main() {
    fileSet := token.NewFileSet()
    f, err := parser.ParseFile(fileSet, "", src, parser.ImportsOnly) //parser.ImportsOnly mode, which means that only package declarations and imports are resolved
    if err != nil {
        panic(err.Error())
    }
    for _, s := range f.Imports{
        fmt.Printf("import: name = %v, path = %#v\n", s.Name, s.Path)
    }
}

Print results:

import: name = <nil>, path = &ast.BasicLit{ValuePos:22, Kind:9, Value:"\"a\""}
import: name = aAlias, path = &ast.BasicLit{ValuePos:40, Kind:9, Value:"\"x/a\""}
import: name = ., path = &ast.BasicLit{ValuePos:55, Kind:9, Value:"\"c\""}
import: name = _, path = &ast.BasicLit{ValuePos:68, Kind:9, Value:"\"d\""}

You can test various types of declaration parsing or expression parsing in the standard library. I won't show the tests one by one below (the test method is the same, but the fields you need to print need to be changed)

type declaration resolution

When the parser gets_ When type is a token, it will call the type parsing method to parse. Similarly, in fileOrNil(), you can see the following code:

......
case _Type:
            p.next()
            f.DeclList = p.appendGroup(f.DeclList, p.typeDecl)
......

It will call the typeDecl() method in appendGroup, which is based on the grammar declared by type type, which is introduced in the front. We know that type has the following usage:

type a string
type b = string

Let's look at what has been done in this method:

  • Create a TypeDecl structure (which will eventually be append ed to File.DeclList)
  • Initialize some data in the structure, such as the location information of the parsed token, group, etc
  • Whether the next token is_ Assign. If yes, get the next token
  • Verify the type of the next token, such as chan, struct, map or func (implemented in the typeOrNil() method, it is actually a pile of switch case s)
  • Returns the TypeDecl structure

When getting the type of the token on the far right, you need to continue parsing according to its type. If it is a chan type, it will create a chan type structure and initialize the information of this chan type (in the process of parsing, each type of structure created is a node)

You can also use go/parser → ParseFile to test the parsing of type

const type declaration resolution

When the parser gets_ When const this token, it will call const's parsing method to parse it. Similarly, in fileOrNil(), you can see the following code:

......
case _Const:
            p.next()
            f.DeclList = p.appendGroup(f.DeclList, p.constDecl)
......

const has the following usage:

const A = 666
const B float64 = 6.66
const (
    _   token = iota
    _EOF       
    _Name    
    _Literal
)

Then you can look at the concrete implementation of constDecl method (in fact, it is parsed according to the syntax of const declaration). It will not be repeated here. It is similar to the type resolution above. The corresponding structure is created first, and then some information of the type is recorded. The difference is that it has a list of names, because const can declare multiple names at the same time. The parsing method is parser go→nameList()

var type declaration resolution

When the parser gets_ When var is a token, it will call the var parsing method to parse. Similarly, in fileOrNil(), you can see the following code:

......
case _Var:
            p.next()
            f.DeclList = p.appendGroup(f.DeclList, p.varDecl)
......

Like the two declarations above, the corresponding parsing method will be called. The difference of var is that its declaration may involve expressions, so the parsing method of var involves the parsing of expressions. I will analyze the parsing of expressions in detail later

Function declaration parsing implementation

💡 Note: a figure will be added to show the process of function analysis

Finally, the function declaration is parsed. As mentioned earlier, the structure of the function declaration node is as follows:

FuncDecl struct {
        Pragma Pragma
        Recv   *Field // recipient
        Name   *Name  //Function name
        Type   *FuncType //Function type
        Body   *BlockStmt // Function body
        decl
    }

In fileOrNil(), you can see the following code:

case _Func:
            p.next()
            if d := p.funcDeclOrNil(); d != nil {
                f.DeclList = append(f.DeclList, d)
            }

The core method of parsing function is funcDeclOrNil. Because the function parsing is a little more complex, I'll stick out its implementation here and explain what each line of code is doing through comments

// Grammar of function
// FunctionDecl = "func" FunctionName ( Function | Signature ) .
// FunctionName = identifier .
// Function     = Signature FunctionBody .

// Grammar of method
// MethodDecl   = "func" Receiver MethodName ( Function | Signature ) .
// Receiver     = Parameters . // Recipient of method
func (p *parser) funcDeclOrNil() *FuncDecl {
    if trace {
        defer p.trace("funcDecl")()
    }

    f := new(FuncDecl) //Create structure (node) of function declaration type
    f.pos = p.pos()
    f.Pragma = p.takePragma()

    if p.tok == _Lparen { //If the left parenthesis is matched (it indicates the method)
        rcvr := p.paramList() //Get recipient list
        switch len(rcvr) {
        case 0:
            p.error("method has no receiver")
        default:
            p.error("method has multiple receivers")
            fallthrough
        case 1:
            f.Recv = rcvr[0]
        }
    }

    if p.tok != _Name { //Judge whether the next token is an identifier (i.e. function name)
        p.syntaxError("expecting name or (")
        p.advance(_Lbrace, _Semi)
        return nil
    }

    f.Name = p.name()
    f.Type = p.funcType() //Get type (continue to understand its internal implementation below)
    if p.tok == _Lbrace { // If it matches the left bracket, the function body is parsed
        f.Body = p.funcBody() //Parse the function body (continue to understand its internal implementation below)
    }

    return f
}

Two important implementations of function parsing: funcType(), funcBody(). What did they do internally?

/*
FuncType struct {
        ParamList  []*Field
        ResultList []*Field
        expr
}
*/
func (p *parser) funcType() *FuncType {
    if trace {
        defer p.trace("funcType")()
    }

    typ := new(FuncType) //Create a function type structure (the main members are parameter list and return value list)
    typ.pos = p.pos()
    typ.ParamList = p.paramList() //Get the parameter list (it returns a Field structure whose members are parameter names and types)
    typ.ResultList = p.funcResult() //Get the list of return values (it also returns a Field structure)

    return typ
}
func (p *parser) funcBody() *BlockStmt {
    p.fnest++ // Record the call level of the function
    errcnt := p.errcnt // Record the current number of errors
    body := p.blockStmt("") // Parse the statements in the function body (see the specific implementation below)
    p.fnest--

    // Don't check branches if there were syntax errors in the function
    // as it may lead to spurious errors (e.g., see test/switch2.go) or
    // possibly crashes due to incomplete syntax trees.
    if p.mode&CheckBranches != 0 && errcnt == p.errcnt {
        checkBranches(body, p.errh)
    }

    return body
}

func (p *parser) blockStmt(context string) *BlockStmt {
    if trace {
        defer p.trace("blockStmt")()
    }

    s := new(BlockStmt) //Create the structure of the function body
    s.pos = p.pos()

    // people coming from C may forget that braces are mandatory in Go
    if !p.got(_Lbrace) {
        p.syntaxError("expecting { after " + context)
        p.advance(_Name, _Rbrace)
        s.Rbrace = p.pos() // in case we found "}"
        if p.got(_Rbrace) {
            return s
        }
    }

    s.List = p.stmtList() //Start parsing the declaration and expression in the function body (the implementation of the edge here is to judge which declaration or statement is based on the obtained token, and it is also implemented through switch case to parse the corresponding grammar according to the matching type)
    s.Rbrace = p.pos()
    p.want(_Rbrace)

    return s
}

The above function analysis process is shown in the figure for easy understanding:

You can see how to parse some functions in the function body, such as assignment, for, go, defer, select, etc

summary

This paper mainly shares the process of syntax parsing as a whole, and briefly shows the internal implementation of type, const and func declaration parsing. In fact, there are expression parsing, including some other syntax parsing. There are many contents, which can't be introduced one by one. Interested partners can study them by themselves

Thanks for reading. The next topic is: abstract syntax tree construction

Keywords: Go Back-end

Added by arnoldd99 on Sat, 08 Jan 2022 07:20:23 +0200