1. Experimental Requirements
1. Goals
Implement an interpreted language, Ulanguage, in C/C++ so that it can be run in the terminal.
Java and Python are commonly described as interpreted languages: an interpreter executes the program at run time, interpreting each instruction as it encounters it (advantage: cross-platform; disadvantage: lower efficiency).
2. U Language functionality
(1) Basic requirements
- Supports the basic variable data types int and string
- Supports declaration, assignment, value fetching, address fetching, and calculation for (global) variables; using an undeclared variable is an error
- int variables support: +, -, *, /, etc.
- string variables support: + (concatenation), size(s) (length), | (character-wise smaller), ^ (character-wise larger), etc.
- Supports the conditional compound statement if(exp) statement1 else statement2 endif (nesting is not required; only a single statement per branch is supported)
- Supports temp variable storage (results that are not assigned to a variable are kept in temp)
- Supports printing the global memory table and symbol table with listMem and listSymbolTable2 (+5~8 points)
(2) Advanced requirements
- Supports function definitions: define f returnType arguType1 arguType2 statements return enddefine
- Supports storing the types and statements of a function
- Supports function calls (activation record storage and symbol table)
- Supports local variable definitions; local variables take precedence over global ones
- Optional: supports while(exp) statement1 statement2 endwhile
3. Considerations
- This experiment consolidates, through practice, what we learned about the use and management of memory, process address space, and related topics.
- Reproduce the program execution process as faithfully as possible in terms of the basic computer operations (assignment, calculation, address fetching, value fetching, declaration), to understand why that process is reasonable and necessary, and why syntax is necessary when writing programs.
- Reproduce the storage, lookup, and use of global variables in the memory and symbol tables according to what you have learned.
- For a function, focus on how its private variables are stored and used in memory; the access link, control link, and other complications of the activation record can be ignored (recursive calls are not considered).
- In addition to global and private variables, the implementation of registers needs to be considered.
- Support defining and calling functions, and allow global variables to be used inside functions.
- Support while and if statements.
- While programming, think about the syntax of the language, and report errors for basic syntax mistakes.
4. Data structures used in this experiment
If you are already familiar with the memory table, symbol table, process address space, function activation, and the other concepts in the experimental requirements, you can skip the text and go straight to the code.
(1) Symbol table
What is a symbol table?
- During the compilation process, the compiler needs to constantly verify the attributes and characteristics of the variable names that appear in the source program.
- The compiler uses symbol tables to record the scope of names and binding information;
- The information in the symbol table is used in semantic analysis at different stages of compilation.
- Symbol tables are the basis for variable address assignment.
Common attributes of symbol tables:
- Symbol name
- Symbol type
- Storage category of the symbol
- Scope and visibility of the symbol
- Storage allocation information for the symbol's variable
- Other attributes of the symbol:
  - Array dope vector
  - Member information of record (structure) types
  - Parameters of functions and procedures
In this experiment, the symbol table is simplified to the following structure:
#define SIZE_OF_SYM 1000   // Symbol table size
bool func_lock = false;    // Function "lock": distinguishes global from local variables; locked before a function executes, unlocked when it finishes
typedef struct ST {        // Structure of a symbol table entry
    string varName;        // Variable name
    int addrr;             // Start address where the variable's actual value is stored
    string varType;        // Variable type (note that functions also have types)
    int varLength;         // Variable length
} ST;
ST symList[SIZE_OF_SYM];   // Symbol table, supporting up to 1000 variables at the same time
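As a rough illustration of how such a table is used, the sketch below declares and looks up names with a linear scan. It is a simplified stand-in, not the post's exact code: the `SymEntry` struct and the helper signatures are assumptions made for this example, and a `std::vector` replaces the fixed-size array to keep it short.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified symbol table entry (mirrors the ST struct above).
struct SymEntry {
    std::string name;  // variable name
    std::string type;  // "int" or "string"
    int addr;          // start address in the memory table (0 = not yet assigned)
    int length;        // number of address units occupied
};

// Return the index of `name` in the table, or -1 if it was never declared.
int findVar(const std::vector<SymEntry>& table, const std::string& name) {
    for (std::size_t i = 0; i < table.size(); ++i) {
        if (table[i].name == name) return static_cast<int>(i);
    }
    return -1;
}

// Declare a variable; fails on an unsupported type or a duplicate name.
bool declareVar(std::vector<SymEntry>& table, const std::string& type,
                const std::string& name) {
    if (type != "int" && type != "string") return false;
    if (findVar(table, name) != -1) return false;
    table.push_back(SymEntry{name, type, 0, 0});
    return true;
}
```

Declaring only records the name and type; the address and length stay at 0 until the first assignment, which is how the symbol table and memory table remain separate.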
(2) Memory tables
The actual value of a variable is not stored in the symbol table, which only holds the variable's storage allocation information; the value itself lives in the memory table. By "memory table" we mean the address space of memory.
The symbol table can be thought of as a fast index: given a variable name, it quickly yields the actual memory address of that variable.
#define SIZE_OF_MRY 10000   // Memory table size; the address space is 0~9999
int mry[SIZE_OF_MRY];       // Memory table; this experiment uses C ints to simulate address units
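Because every address unit is an int, a string has to be stored one character per cell. The sketch below shows that idea in isolation; `storeString` and `loadString` are hypothetical helper names, and the memory table is modelled as a `std::vector<int>` for brevity.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Write each character of `s` into consecutive cells starting at `addr`.
// Assumes the caller has already checked that the cells are free and in range.
void storeString(std::vector<int>& mem, int addr, const std::string& s) {
    for (std::size_t k = 0; k < s.size(); ++k) {
        mem[addr + k] = static_cast<unsigned char>(s[k]);  // character code as an int
    }
}

// Rebuild a string from `length` cells starting at `addr`.
std::string loadString(const std::vector<int>& mem, int addr, int length) {
    std::string out;
    for (int k = 0; k < length; ++k) {
        out.push_back(static_cast<char>(mem[addr + k]));
    }
    return out;
}
```

The start address and the length are exactly the two values the symbol table needs to keep for a string variable.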
(3) Registers
Registers store the intermediate results of calculations.
In this experiment, the register is used only for calculation results that are not assigned to a variable, including the results of function calls.
typedef struct temp {
    string varType;
    string varValue;
} temp;
temp tmp;   // Temporary variable storage (assuming only one register)
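A minimal sketch of how an unassigned result could be placed in this single register is shown below. The full listing decides the type with `atoi`; this example swaps in a stricter digit check instead, and the names `Temp`, `looksLikeInt`, and `storeInTemp` are assumptions made for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Simplified copy of the register structure above.
struct Temp {
    std::string varType;
    std::string varValue;
};

// True if `s` is an optionally signed run of decimal digits.
bool looksLikeInt(const std::string& s) {
    std::size_t i = (!s.empty() && (s[0] == '-' || s[0] == '+')) ? 1 : 0;
    if (i == s.size()) return false;  // empty string or a lone sign
    for (; i < s.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(s[i]))) return false;
    }
    return true;
}

// Store a result that was not assigned to any variable in the register.
void storeInTemp(Temp& reg, const std::string& result) {
    reg.varType = looksLikeInt(result) ? "int" : "string";
    reg.varValue = result;
}
```

Each new unassigned result overwrites the previous one, which matches the "only one register" assumption.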
(4) Process address space
- For background on process address space, the following blog post is highly recommended:
String up process lists, process control blocks, inode nodes, file descriptor lists, file entities, file systems, etc. with pictures
(5) Activation Record/Function Activation/Stack Frame
To quote the classic explanation from Baidu Encyclopedia:
"A stack frame, also known as a procedure activation record, is a data structure used by compilers to implement procedure/function calls."
Since the terminology can be confusing, the heading above lists several common names for the same concept.
For example, on Unix, a function activation generally needs to save:
- Parameters
- Types
- Local data: where the function's local or temporary variables are stored
- Control link: points to the activation of the function that called this one
- Return value: used by the activation of the calling function (following stack-frame order)
- Access link: points to the activation of another function whose data this function accesses
- Machine status: records the state of the process when it is suspended
When calling a function:
1. Load the text and data segments (etc.), execute main, and allocate an activation record/stack frame on the stack;
2. Place the arguments (from the previous function activation, global variables, etc.) in the argument table;
3. Store the result of the calculation in the return value;
4. Write the value back to the corresponding statement found through the control link;
5. Release the activation record.
/*
 * This experiment only considers the case of a single stack frame and ignores the access link and control link.
 * That is, nested function calls and recursion are not considered.
 * (PS: they could be supported as well, but there was not enough time to implement them.)
 */
#define SIZE_OF_MRY_FUNC 1000       // Memory table size of the activation record
#define SIZE_OF_SYM_FUNC 100        // Symbol table size of the activation record
typedef struct funcActivation {     // Activation record structure
    string funcName;                // Function name
    string returnType;              // Return type
    string returnVar;               // Return value
    string funcVar1;                // Parameter 1
    string funcVar2;                // Parameter 2 (only two-parameter functions are considered in this experiment)
    ST func_ST[SIZE_OF_SYM_FUNC];   // Local symbol table of the activation, up to 100 local variables
    int func_mry[SIZE_OF_MRY_FUNC]; // Local memory table
} funcActivation;
funcActivation fat;                 // Only a single stack frame is considered in this experiment
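The call steps listed above, together with the single-frame simplification, can be sketched roughly as follows. This is an int-only toy model under stated assumptions: `Interp`, `lookup`, and `callMax` are hypothetical names, `std::map` stands in for the symbol and memory tables, and the frame is cleared on return as in step 5 (the post's interpreter appears to keep it until the next definition so that it can be inspected).

```cpp
#include <map>
#include <string>

// Hypothetical simplified interpreter state: int-only, at most one frame.
struct Interp {
    std::map<std::string, int> globals;  // stands in for symList + mry
    std::map<std::string, int> locals;   // stands in for fat.func_ST + fat.func_mry
    bool funcLock = false;               // true while a function body is running
};

// Resolve a name: locals first while a function is active, then globals.
// Returns false if the name is not declared anywhere.
bool lookup(const Interp& in, const std::string& name, int& out) {
    if (in.funcLock) {
        auto it = in.locals.find(name);
        if (it != in.locals.end()) { out = it->second; return true; }
    }
    auto it = in.globals.find(name);
    if (it != in.globals.end()) { out = it->second; return true; }
    return false;
}

// Run a two-parameter "max" body inside the single frame.
int callMax(Interp& in, int arg1, int arg2) {
    in.locals.clear();          // fresh frame: erase whatever the last call left
    in.funcLock = true;         // lock: name lookups now prefer locals
    in.locals["a"] = arg1;      // place the arguments in the frame
    in.locals["b"] = arg2;
    int a = 0, b = 0;
    lookup(in, "a", a);
    lookup(in, "b", b);
    int ret = (a > b) ? a : b;  // compute the return value
    in.funcLock = false;        // unlock
    in.locals.clear();          // release the activation record
    return ret;                 // write the value back to the caller
}
```

If a global named `a` also exists, the local `a` shadows it for the duration of the call, which is the "local variables take precedence" requirement.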
2. Demonstration of Results
Simple declaration, assignment, and calculation with type int:
"?" prints the value of a variable.
After all, it's your own language, so how output works is up to you ~
Declaration, assignment, and calculation with type string:
[Instructions]
'|' and '^' are the character-wise smaller and character-wise larger string operations, respectively. Note that in the code listing in Section 3, the '|' branch keeps the larger character at each position and the '^' branch keeps the smaller one.
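To make the character-wise operations concrete, here is a minimal sketch. The function name `combineCharwise` and the `keepSmaller` flag are made up for this example; appending the leftover tail of the longer string follows what the full listing appears to do.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Combine two strings position by position, keeping the smaller (or larger)
// character at each index; the tail of the longer string is appended unchanged.
std::string combineCharwise(const std::string& a, const std::string& b,
                            bool keepSmaller) {
    const std::size_t n = std::min(a.size(), b.size());
    std::string out;
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(keepSmaller ? std::min(a[i], b[i]) : std::max(a[i], b[i]));
    }
    out += (a.size() > b.size()) ? a.substr(n) : b.substr(n);
    return out;
}
```

For example, combining "abc" and "bad" gives "aac" when keeping the smaller characters and "bbd" when keeping the larger ones. Building the result in a `std::string` also avoids the unterminated `char` buffer that a raw array would need.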
View the symbol and memory tables:
[Instructions]
symtab shows the global symbol table.
memory shows the global memory table.
Fetch a variable's address and length:
It is worth mentioning that memory in this experiment is managed with a first-fit algorithm: the table is scanned from the lowest address, and the first free region that is large enough is used.
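A first-fit search over an int-based memory table can be sketched as below. The function name `firstFit` is hypothetical, and the sketch assumes, as the full listing does, that a cell holding 0 is free and that address 0 is reserved to mean "not yet assigned".

```cpp
#include <vector>

// Find the first run of `need` consecutive free cells (value 0) and return its
// start address, or -1 if no such run exists. Address 0 is never handed out.
int firstFit(const std::vector<int>& mem, int need) {
    if (need <= 0) return -1;
    int run = 0;  // length of the current run of free cells
    for (int i = 1; i < static_cast<int>(mem.size()); ++i) {
        run = (mem[i] == 0) ? run + 1 : 0;
        if (run == need) return i - need + 1;
    }
    return -1;
}
```

One consequence of treating 0 as "free" is that an int variable holding the value 0 is indistinguishable from an empty cell; a separate occupancy flag per cell would avoid this, at the cost of a second table.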
Function definition (note the syntax):
[Instructions]
After the function name, return type, and parameters have been entered, the interpreter immediately reports the function's data type (in the #System notice field).
Note: Since the program emulates only one stack frame, the previous stack frame must be popped when a new function is defined (in practice, the information left at the local addresses is erased).
Write a MAX function to demonstrate:
Complete syntax for function definition:
- define function name
return type (type of the return value)
parameter type parameter name
parameter type parameter name
...
statement
...
return parameter name
endDefine
The default text (code) address is set to 10001.
Note that information about the function name is stored in the global symbol table.
Function call:
[Instructions]
Syntax for function calls: function name (val1, val2)
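Recognising a call of this shape comes down to splitting the text at the parentheses and commas. The sketch below is one hedged way to do it; the `Call` struct and `parseCall` are names invented for this example, and it assumes the call is written without spaces.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parsed form of "name(arg1,arg2,...)".
struct Call {
    std::string name;
    std::vector<std::string> args;
};

// Parse a call expression; returns false if the parentheses are malformed.
bool parseCall(const std::string& text, Call& out) {
    const std::size_t open = text.find('(');
    if (open == std::string::npos || open == 0) return false;  // no name or no '('
    if (text.back() != ')') return false;                      // must end with ')'
    out.name = text.substr(0, open);
    out.args.clear();
    const std::string inner = text.substr(open + 1, text.size() - open - 2);
    std::size_t start = 0;
    while (!inner.empty() && start <= inner.size()) {
        std::size_t comma = inner.find(',', start);
        if (comma == std::string::npos) comma = inner.size();
        out.args.push_back(inner.substr(start, comma - start));
        start = comma + 1;
    }
    return true;
}
```

Each argument is returned as raw text, so the caller still has to resolve it as either a literal or a variable name before binding it to a parameter.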
Demonstration of Registers
View register contents: tmptab
Implementation of while and if statements; global variables are supported within functions
[Instructions]
- while statement syntax format:
while (expression) statement endWhile
- if statement syntax format:
if (exp) statement1 else statement2 endif
Note: The else branch of an if statement is optional.
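Both statements hinge on evaluating the condition. The following int-only sketch shows the idea; `valueOf`, `judgeCondition`, and `runCountdown` are hypothetical names, variables live in a `std::map` for brevity, and the "not found" check uses `std::string::npos`, the portable constant, rather than a hard-coded number.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>

// Resolve an operand: a declared int variable, otherwise a numeric literal.
int valueOf(const std::map<std::string, int>& vars, const std::string& tok) {
    auto it = vars.find(tok);
    return it != vars.end() ? it->second : std::atoi(tok.c_str());
}

// Evaluate "lhs==rhs", "lhs>rhs" or "lhs<rhs" (no spaces, ints only).
// Returns false for an expression without a supported operator.
bool judgeCondition(const std::map<std::string, int>& vars,
                    const std::string& exp) {
    std::size_t pos = exp.find("==");
    if (pos != std::string::npos)
        return valueOf(vars, exp.substr(0, pos)) == valueOf(vars, exp.substr(pos + 2));
    pos = exp.find('>');
    if (pos != std::string::npos)
        return valueOf(vars, exp.substr(0, pos)) > valueOf(vars, exp.substr(pos + 1));
    pos = exp.find('<');
    if (pos != std::string::npos)
        return valueOf(vars, exp.substr(0, pos)) < valueOf(vars, exp.substr(pos + 1));
    return false;
}

// Simulate "while(i>0) i=i-1 endWhile" and count the iterations.
int runCountdown(std::map<std::string, int>& vars) {
    int iterations = 0;
    while (judgeCondition(vars, "i>0")) {  // re-evaluate the condition each pass
        vars["i"] -= 1;                    // the single loop statement
        ++iterations;
    }
    return iterations;
}
```

An if statement uses the same evaluation once and then runs either statement1 or, when present, statement2.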
Call the mega function you just defined:
Look at the results:
You can see that the global variable global has been assigned as a result of the function call (based on the outcome of the while and if statements inside the function).
[Instructions]
View the local symbol table: fats
View local memory table: fatm
3. Complete Code
In the spirit of open source, sharing, learning from each other, and giving back to the community, the complete code is provided below. Thank you for your likes, bookmarks, and follows.
#include<iostream> #include<stdlib.h> #include<stdio.h> #include<string> #include<cstring> #include<iomanip> #include<malloc.h> #include<vector> #include<conio.h> #include<sstream> #include <algorithm> using namespace std; //By predefining these constants, you can eliminate the time consumed by calculating the length of the array #define SIZE_OF_MRY 10000 // Memory Table #define SIZE_OF_SYM 1000//Symbol Table #define SIZE_OF_MRY_FUNC 1000 //Memory table for activating records #define SIZE_OF_SYM_FUNC 100//Symbol Table of Activated Records bool func_lock = false; //Function lock, lock before function execution, unlock when function execution is complete typedef struct ST{ string varName; int addrr; string varType; int varLength; }ST; ST symList[SIZE_OF_SYM];//Symbol table, supporting up to 1000 variables at the same time int mry[SIZE_OF_MRY];//Memory table typedef struct temp{ string varType; string varValue; }temp; temp tmp;//Temporary variable memory typedef struct funcActivation{ string funcName; string returnType; string returnVar; string funcVar1; string funcVar2; ST func_ST[SIZE_OF_SYM_FUNC]; //Local variables in function activation, up to 100 local variables supported int func_mry[SIZE_OF_MRY_FUNC]; }funcActivation; funcActivation fat; int getSTLen(ST s[]){ if(s == symList){ return SIZE_OF_SYM; }else if(s == fat.func_ST){ return SIZE_OF_SYM_FUNC; } } int getMRYLen(int m[]){ if(m == mry){ return SIZE_OF_MRY; }else if(m == fat.func_mry){ return SIZE_OF_MRY_FUNC; } } vector<string> split(const string& str, const string& delim) { vector<string> res; if("" == str) return res; char * strs = new char[str.length() + 1] ; strcpy(strs, str.c_str()); char * d = new char[delim.length() + 1]; strcpy(d, delim.c_str()); char *p = strtok(strs, d); while(p) { string s = p; //Convert the split string to string type res.push_back(s); //Save in Result Array p = strtok(NULL, d); } return res; } string intToStr(int ans){ stringstream s; s << ans; string p = s.str(); const char* 
res = p.c_str(); return res; } void memory(int *mry){ cout<<"************ Memory List ************"<<endl; cout<<"addrress"<<'\t'<<"value"<<endl; int len = getMRYLen(mry); for(int i = 1; i < len; i++ ){ if(mry[i] != 0) cout<<i<<'\t'<<mry[i]<<endl; } cout<<endl; } void fat_memory(int *mry){ cout<<"************ Memory List of Activation Record ************"<<endl; cout<<"addrress"<<'\t'<<"value"<<endl; for(int i = 1; i < 1000; i++ ){ if(fat.func_mry[i] != 0) cout<<i<<'\t'<<fat.func_mry[i]<<endl; } cout<<endl; } void symctl(ST *symList){ cout<<"************ Symbol List ************"<<endl; int len = getSTLen(symList); cout<<"varType"<<'\t'<<"varName"<<'\t'<<" varAddrr"<<'\t'<<"varLength"<<endl; for(int i = 0; i < len; i++){ if(symList[i].varName == "") break; cout<<symList[i].varType<<'\t'<<symList[i].varName<<"\t \t"<<symList[i].addrr<<"\t\t"<<symList[i].varLength<<endl; } cout<<endl; } void showTmp(){ cout<<"********** register **********"<<endl; cout<<"Stored data type:"<<tmp.varType<<endl; cout<<"Stored data values: "<<tmp.varValue<<endl; } void declear(string type, ST *symList){ if(type == "string" || type == "int"){ string varName; cin >> varName; int len = getSTLen(symList); for(int i = 0; i < len; i++){ //Memory table sequential storage management if( symList[i].varName == "" ){ symList[i].varName = varName; //Add Symbol Table symList[i].varType = type; break; } if(symList[i].varName == varName){ cout<<"Variable name conflict! This varluable has been defined already!"<<endl<<endl; break; } } }else{ cout<<"The data type is not supported !"<<endl; } } void func_declear(string type, string varName, ST *symList){ if(type == "string" || type == "int"){ int len = getSTLen(symList); for(int i = 0; i < len; i++){ //Memory table sequential storage management if( symList[i].varName == "" ){ symList[i].varName = varName; //Add Symbol Table symList[i].varType = type; break; } if(symList[i].varName == varName){ cout<<"Variable name conflict! 
This varluable has been defined already!"<<endl<<endl; break; } } }else{ cout<<"The data type is not supported !"<<endl; } } int findVar(string cmd1,ST *symList){ int len = getSTLen(symList); for(int i = 0; i < len; i++){ if(symList[i].varName == cmd1) return i; else if(symList[i].varName == "") return -1; } } void op_againAssignment(string varValue, int keyInST,ST *symList, int *mry){ if(symList[keyInST].varType == "int"){ mry[symList[keyInST].addrr] = atoi(varValue.c_str()); }else if(symList[keyInST].varType == "string"){ for(int i = symList[keyInST].addrr; i < symList[keyInST].addrr+symList[keyInST].varLength; i++){ mry[i] = 0; } int len = getMRYLen(mry); for(int i = 1; i < len; i++){ if(mry[i] == NULL){ int j = i; for(; j < i+varValue.length(); j++){ //First-time adaptation method, considering fragmentation if(mry[j] != NULL){ i =j; break; } } if(j == i+varValue.length()){ //Determining the location of j at the end of the loop determines whether or not it is stored here for(int j = i, seq = 0; j < i+varValue.length(); j++, seq++){ mry[j] = varValue[seq]; //Store each char in the string as an ASSIC code in an integer array } symList[keyInST].varLength = varValue.length(); symList[keyInST].addrr = i; break; } } } } } void op_assignment(string varValue, int keyInST, ST *symList, int *mry){ if(symList[keyInST].varType == "int"){ int value = atoi(varValue.c_str()); if(value == 0 && varValue != "0"){ cout<<"Data type mismatch! 
Assignment failed!"<<endl; } else{ int len = getMRYLen(mry); for(int i = 1; i < len; i++){ if(mry[i] == NULL){ mry[i] = value; symList[keyInST].addrr = i; symList[keyInST].varLength = 1; //Plastic occupies an int position (4Bytes) break; } } } }else if( symList[keyInST].varType == "string"){ int len = getMRYLen(mry); for(int i = 1; i < len; i++){ if(mry[i] == NULL){ int j = i; for(; j < i+varValue.length(); j++){ //First-time adaptation method, considering fragmentation if(mry[j] != NULL){ i =j; break; } } if(j == i+varValue.length()){ //Determining the location of j at the end of the loop determines whether or not it is stored here for(int j = i, seq = 0; j < i+varValue.length(); j++, seq++){ mry[j] = varValue[seq]; //Store each char in the string as an ASSIC code in an integer array } symList[keyInST].varLength = varValue.length(); symList[keyInST].addrr = i; break; } } } } } string test(string elm, ST sym[], int m[]); int fetchInt(int key, ST *symList, int *mry); int op_culInt(string exp){ string elm1; string elm2; char op; for(int i = 0; i < exp.length(); i++){ if(exp[i] == '*' || exp[i] == '/' || exp[i] == '+' || exp[i] == '-'){ elm1 = exp.substr(0,i); elm2 = exp.substr(i+1,exp.length()-1); op = exp[i]; } } string Elm1 = test(elm1,symList,mry); string Elm2 = test(elm2,symList,mry); int intElm1 = atoi(Elm1.c_str()); int intElm2 = atoi(Elm2.c_str()); if( (intElm1 == 0)&&(Elm1 != "0") || (intElm2 == 0)&&(Elm2 == "0") ){ cout<<"Number type mismatch! 
Error in calculation!"<<endl; return -1; } switch (op) { case '*': return intElm1*intElm2;break; case '/': return intElm1/intElm2;break; case '+': return intElm1+intElm2;break; case '-': return intElm1-intElm2;break; defalut: cout<<"Calculation module failure!"<<endl<<endl; } } string test(string elm, ST *symList, int *mry); string op_culString(string exp,ST *symList, int *mry){ string elm1; string elm2; char op; for(int i = 0; i < exp.length(); i++){ if(exp[i] == '|' || exp[i] == '^' || exp[i] == '+' ){ elm1 = exp.substr(0,i); elm2 = exp.substr(i+1,exp.length()-1); elm1 = test(elm1,symList,mry); elm2 = test(elm2,symList,mry); op = exp[i]; } } int intElm1 = atoi(elm1.c_str()); int intElm2 = atoi(elm2.c_str()); switch (op) { case '+': { string ans = elm1+elm2; return ans; break; } case '|': { int minLen = min(elm1.length(),elm2.length()); char res[minLen]; int i = 0; for(; i < minLen; i++){ res[i] = (elm1[i]>elm2[i])?elm1[i]:elm2[i]; } string ans = res; if(elm1.length() != elm2.length()){ string back = (elm1.length() > elm2.length())?elm1.substr(i,elm1.length()-1):elm2.substr(i,elm2.length()-1); ans = ans + back; } return ans; break; } case '^': { int minLen = min(elm1.length(),elm2.length()); char res[minLen]; int i = 0; for(; i < minLen; i++){ res[i] = (elm1[i]<elm2[i])?elm1[i]:elm2[i]; } string ans = res; if(elm1.length() != elm2.length()){ string back = (elm1.length() > elm2.length())?elm1.substr(i,elm1.length()-1):elm2.substr(i,elm2.length()-1); ans = ans + back; } return ans; break; } defalut: cout<<"Calculation module failure!"<<endl<<endl; } } void call_func(vector<string> line_call); void assignment(int key, ST *symList, int *mry){ bool flag_deal_func = false; //Used to mark whether a function call has been made; string cmd3; cin>>cmd3; string res; if(split(cmd3,"(").size() == 2){ flag_deal_func = true; vector<string> testify; testify.push_back(cmd3); call_func(testify); } if(symList[key].varType == "int"){ if(flag_deal_func == true){ if(fat.returnType != 
"int"){ cout<<"Return value type does not match variable type"<<endl; } else{ res = fat.returnVar; } } else { cmd3 = test(cmd3, symList, mry); int flag = 0; for(int i=0; i<cmd3.length(); i++){ if(cmd3[i] == '*' || cmd3[i] == '/' || cmd3[i] == '+' || cmd3[i] == '-'){ flag = 1; if(i == 0) { flag = 0; //Previous binary operations without negative numbers being considered for the time being } break; } } if(flag == 1 ){ int ans = op_culInt(cmd3); res = intToStr(ans); }else{ res = cmd3; } } if(symList[key].addrr != 0){// Case of reassignment op_againAssignment(res,key,symList,mry); }else{ // Case of initial assignment op_assignment(res,key,symList,mry); } }else if(symList[key].varType == "string"){ if(flag_deal_func == true){ if(fat.returnType != "string"){ } else{ res = fat.returnVar; } } else{ cmd3 = test(cmd3, symList, mry); int flag = 0; for(int i=0; i<cmd3.length(); i++){ if(cmd3[i] == '|' || cmd3[i] == '^' || cmd3[i] == '+'){ flag = 1; } } if(flag == 1 ){ res = op_culString(cmd3,symList,mry); }else{ res = cmd3; } } if(symList[key].addrr != 0){// Case of reassignment op_againAssignment(res,key,symList,mry); }else { op_assignment(res,key,symList,mry); } } } void all_assignment(int key, string varValue, ST *symList,int *mry) {//A collection of assignments from the first and second assignments if(symList[key].addrr != 0){// Case of reassignment op_againAssignment(varValue,key,symList,mry); }else{ // Case of initial assignment op_assignment(varValue,key,symList,mry); } } void func_assignment(string varName, string cmd3, ST *sym,int *m){ //(Assigned variable name, assigned value or formula, symbol table, memory table) int key = -1; bool flag_exist = true; ST *symList; int *mry; if(func_lock == false){ if (findVar(varName,sym) < 0) flag_exist = false; else{ symList = sym; mry = m; key = (findVar(varName,symList)); } }else if (func_lock == true){ if (findVar(varName,sym) < 0) { if (findVar(varName,::symList) < 0){ flag_exist = false; }else{ symList = ::symList; mry = 
::mry; key = (findVar(varName,symList)); } }else{ symList = sym; mry = m; key = (findVar(varName,symList)); } } if(flag_exist == false){ cout<<"variable"<<varName<<"Not declared!"<<endl; } else{ //The case of int if(symList[key].varType == "int"){ int flag = 0; for(int i=0; i<cmd3.length(); i++){ if(cmd3[i] == '*' || cmd3[i] == '/' || cmd3[i] == '+' || cmd3[i] == '-'){ flag = 1; if(i == 0) { flag = 0; //Previous binary operations without negative numbers being considered for the time being } break; } } if(flag == 1 ){ int ans = op_culInt(cmd3); string res = intToStr(ans); all_assignment(key, res, symList,mry); //assignment } else{ //No calculation required if(func_lock == false){ if(findVar(cmd3,symList) > -1){ //Assign the value of variable cmd3 to variable varName if(symList[findVar(cmd3,symList)].varType == "string"){ } else if(symList[findVar(cmd3,symList)].varType == "int"){ int k = findVar(cmd3,symList); int ans = fetchInt(k,symList,mry); string res = intToStr(ans); all_assignment(key, res, symList,mry); //assignment } }else{ if(atoi(cmd3.c_str()) > 0 || cmd3 == "0" ){ all_assignment(key, cmd3, symList,mry); //assignment } } }else if(func_lock == true){ string res = test(cmd3,symList,mry); all_assignment(key, res, symList,mry); //assignment } } }else if(symList[key].varType == "string"){ int flag = 0; for(int i=0; i<cmd3.length(); i++){ if(cmd3[i] == '|' || cmd3[i] == '^' || cmd3[i] == '+'){ flag = 1; } } if(flag == 1 ){ cmd3 = op_culString(cmd3,symList,mry); }else{ cmd3 = test(cmd3,symList,mry); } if(symList[key].addrr != 0){// Case of reassignment op_againAssignment(cmd3,key,symList,mry); }else { op_assignment(cmd3,key,symList,mry); } } } } int fetchAddrr(string varName, ST *symList) { int len = getSTLen(symList); for(int i = 0; i <len; i++){ if(symList[i].varName == varName){ return symList[i].addrr; //Direct return address } } cout<<"Variable undeclared!"<<endl; return 0; } string fetchString(int key, ST *symList, int *mry){ int len = 
symList[key].varLength; int addrr = symList[key].addrr; char ans[len]; for(int i = addrr, seq = 0; i < addrr+len; i++, seq++){ ans[seq] = mry[i]; } string res; res += ans; return res; } int fetchInt(int key, ST *symList, int *mry){ return mry[symList[key].addrr]; } int fetchLen(string varName, ST *symList){ if(findVar(varName,symList) == -1){ cout<<"Variable undeclared!"<<endl; return -1; } else { int pos = findVar(varName,symList); //Cout<<"Location in ST:"< pos< endl; int len = symList[pos].varLength; //Cout<<"ST Medium Length:"< len< endl; return len; } } string test(string elm, ST *sym, int *m){ if(func_lock == false){ int intElm = atoi(elm.c_str()); if( intElm == 0 && elm!= "0"){ // Not a pure number, value is required if(findVar(elm,sym) < 0){ return elm; //The variable name could not be found in ST, indicating elm is a new string }else{ //Description is the variable name already declared on ST if(sym[findVar(elm,sym)].varType == "int"){ return intToStr(fetchInt(findVar(elm,sym),sym,m)); }else if(sym[findVar(elm,sym)].varType == "string"){ return fetchString(findVar(elm,sym),sym,m); } } }else{ //Is a quintessential number that can be returned directly return elm; //Returns the pure number stored as a string } } else if(func_lock == true){ //Lock, indicating that the function is being executed, not only global but also local int intElm = atoi(elm.c_str()); if( intElm == 0 && elm!= "0"){ // Not a pure number, value is required if(findVar(elm,fat.func_ST) < 0 && findVar(elm,symList) < 0){ //Find Local Find Global return elm; //The variable name could not be found in ST, indicating elm is a new string }else{ //Description is the variable name already declared on ST if(findVar(elm,fat.func_ST) >= 0) { if(sym[findVar(elm,fat.func_ST)].varType == "int"){ return intToStr(fetchInt(findVar(elm,fat.func_ST),fat.func_ST,fat.func_mry)); }else if(sym[findVar(elm,fat.func_ST)].varType == "string"){ return fetchString(findVar(elm,fat.func_ST),fat.func_ST,fat.func_mry); } } 
else if(findVar(elm,symList) >= 0){ if(symList[findVar(elm,symList)].varType == "int"){ return intToStr(fetchInt(findVar(elm,symList),symList,mry)); }else if(symList[findVar(elm,symList)].varType == "string"){ return fetchString(findVar(elm,symList),symList,mry); } } } }else{ return elm; } } } //The error in judging expression in if statement is ------------------------------------------------------ bool judgeExp(string exp, ST *symList, int *mry){ if(exp.find("==") != 18446744073709551615){ string elm1 = exp.substr(0,exp.find("==")); string elm2 = exp.substr(exp.find("==")+2); elm1 = test(elm1,symList,mry); elm2 = test(elm2,symList,mry); if(elm1 == elm2) return true; else return false; }else if(exp.find(">") != 18446744073709551615 ){ string elm1 = exp.substr(0,exp.find(">")); string elm2 = exp.substr(exp.find(">")+1); elm1 = test(elm1,symList,mry); elm2 = test(elm2,symList,mry); if( (atoi(elm1.c_str()) == 0 && elm1!= "0") && (atoi(elm2.c_str()) == 0 && elm2!= "0") ){ //String, directly compared by ASSIC code if(elm1 < elm2) return true; else return false; }else if(atoi(elm1.c_str()) != 0 && atoi(elm2.c_str()) != 0 ){ //String converted from pure number, converted to int for comparison int e1 = atoi(elm1.c_str()); int e2 = atoi(elm2.c_str()); if(e1 > e2) return true; else return false; }else{ cout<<"Data type mismatch!"<<endl; return false; } }else if(exp.find("<") != 18446744073709551615){ string elm1 = exp.substr(0,exp.find("<")); string elm2 = exp.substr(exp.find("<")+1); elm1 = test(elm1,symList,mry); elm2 = test(elm2,symList,mry); if( (atoi(elm1.c_str()) == 0 && elm1!= "0") && (atoi(elm2.c_str()) == 0 && elm2!= "0") ){ //String, directly compared by ASSIC code if(elm1 < elm2) return true; else return false; }else if(atoi(elm1.c_str()) != 0 && atoi(elm2.c_str()) != 0 ){ //String converted from pure number, converted to int for comparison int e1 = atoi(elm1.c_str()); int e2 = atoi(elm2.c_str()); if(e1 < e2) return true; else return false; }else{ cout<<"Data 
type mismatch!"<<endl; return false; } } } string cul_Statement(string stm,ST *symList, int *mry){ char op; int pos; for(int i=0; i<stm.length(); i++){ if(stm[i] == '+'){ op = '+'; pos = i; break; } else if(stm[i] == '-'){ op = '-'; pos = i; break; }else if(stm[i] == '*'){ op = '*'; pos = i; break; }else if(stm[i] == '/'){ op = '/'; pos = i; break; }else if(stm[i] == '|'){ op = '|'; pos = i; break; }else if(stm[i] == '^'){ op = '^'; pos = i; break; } } if(!op){ // Cout<<"No calculation!"< endl; return test(stm,symList,mry); } string elm1 = stm.substr(0,pos); string elm2 = stm.substr(pos+1); elm1 = test(elm1,symList,mry); elm2 = test(elm2,symList,mry); switch(op) { case '+':{ if( (atoi(elm1.c_str()) == 0 && elm1!= "0") && (atoi(elm2.c_str()) == 0 && elm2!= "0") ){ //Are strings, using the + method of strings string ans = elm1+elm2; return ans; }else if(atoi(elm1.c_str()) != 0 && atoi(elm2.c_str()) != 0 ){ // Are all reshapes, use the + method of reshaping int e1 = atoi(elm1.c_str()); int e2 = atoi(elm2.c_str()); string ans = intToStr(e1+e2); return ans; }else{ cout<<"Data type mismatch!"<<endl; return "ERROR!";//Optimize based on key } break; } case '-':{ if(atoi(elm1.c_str()) != 0 && atoi(elm2.c_str()) != 0 ){ int e1 = atoi(elm1.c_str()); int e2 = atoi(elm2.c_str()); string ans = intToStr(e1-e2); // Return value of cout<<'-:'< ans< endl; return ans; }else{ cout<<"Data type mismatch!"<<endl; return "ERROR!";//Optimize based on key } break; } case '*':{ if(atoi(elm1.c_str()) != 0 && atoi(elm2.c_str()) != 0 ){ int e1 = atoi(elm1.c_str()); int e2 = atoi(elm2.c_str()); string ans = intToStr(e1*e2); // Return value of cout<<'*:'< ans< endl; return ans; }else{ cout<<"Data type mismatch!"<<endl; return "ERROR!";//Optimize based on key } break; } case '/':{ if(atoi(elm1.c_str()) != 0 && atoi(elm2.c_str()) != 0 ){ int e1 = atoi(elm1.c_str()); int e2 = atoi(elm2.c_str()); string ans = intToStr(e1/e2); // Return value of cout <'/:'< ans < endl; return ans; }else{ cout<<"Data 
type mismatch!"<<endl; return "ERROR!";//Optimize based on key } break; } case '|':{ if( (atoi(elm1.c_str()) == 0 && elm1!= "0") && (atoi(elm2.c_str()) == 0 && elm2!= "0") ){ //Are strings, using the + method of strings int minLen = min(elm1.length(),elm2.length()); char res[minLen]; int i = 0; for(; i < minLen; i++){ res[i] = (elm1[i]>elm2[i])?elm1[i]:elm2[i]; } string ans = res; if(elm1.length() != elm2.length()){ string back = (elm1.length() > elm2.length())?elm1.substr(i,elm1.length()-1):elm2.substr(i,elm2.length()-1); ans = ans + back; } return ans; }else{ cout<<"Data type mismatch!"<<endl; return "ERROR!";//Optimize based on key } break; } case '^':{ if( (atoi(elm1.c_str()) == 0 && elm1!= "0") && (atoi(elm2.c_str()) == 0 && elm2!= "0") ){ //Are strings, using the + method of strings int minLen = min(elm1.length(),elm2.length()); char res[minLen]; int i = 0; for(; i < minLen; i++){ res[i] = (elm1[i]<elm2[i])?elm1[i]:elm2[i]; } string ans = res; if(elm1.length() != elm2.length()){ string back = (elm1.length() > elm2.length())?elm1.substr(i,elm1.length()-1):elm2.substr(i,elm2.length()-1); ans = ans + back; } return ans; }else{ cout<<"Data type mismatch!"<<endl; return "ERROR!";//Optimize based on key } break; } default:{ cout<<"statement Calculation module failure!"<<endl; return "ERROR!"; break; } } } void deal_Statement(string stm, ST *symList, int *mry){ int pos = -1; for(int i = 0; i < stm.length(); i++){ if(stm[i] == '='){ pos = i; break; } } if(pos == -1){ string ans = cul_Statement(stm,symList,mry); if(atoi(ans.c_str()) != 0 || ans == "0"){ tmp.varType = "int"; }else{ tmp.varType = "string"; } tmp.varValue = ans; }else{ string varName = stm.substr(0,pos); string operation = stm.substr(pos+1); string ans = cul_Statement(operation,symList,mry); func_assignment(varName,ans,symList,mry); //(Assigned variable name, assigned value or formula, symbol table, memory table) } } void fetch(string varName, ST *symList, int *mry){ if( findVar(varName,symList) >- 1){ 
        // cout<<"Value!"<<endl;
        int key = findVar(varName, symList);
        if (symList[key].varType == "int") {
            int res = fetchInt(key, symList, mry);
            cout << res << endl << endl;
        } else if (symList[key].varType == "string") {
            string str = fetchString(key, symList, mry);
            cout << str << endl << endl;
        }
    }
}

// Reads a function body line by line until the endDefine terminator
vector<string> writeVector(vector<string> text) {
    text.clear();
    string content;
    // Also stop at end of input, so a missing endDefine cannot loop forever
    while (getline(cin, content)) {
        if (content == "endDefine") break;
        text.push_back(content);
    }
    return text;
}

void deal_if(vector<string> text, ST *symList, int *mry) {
    // Expects text to be already split into tokens
    if (text.size() < 3) {
        cout << "if Syntax error" << endl;
        return;
    }
    string cmd1 = text[0];
    string exp = cmd1.substr(3, cmd1.length() - 4); // strip "if(" and ")"
    string statement1 = text[1];
    string cmd3 = text[2];
    bool flag;
    if (cmd3 == "else") {
        if (text.size() < 5 || text[4] != "endif") {
            cout << "if syntax error: the statement must end with endif!" << endl << endl;
        } else {
            string statement2 = text[3];
            flag = judgeExp(exp, symList, mry);
            if (flag == true) {
                deal_Statement(statement1, symList, mry);
            } else {
                deal_Statement(statement2, symList, mry);
            }
        }
    } else if (cmd3 == "endif") {
        flag = judgeExp(exp, symList, mry);
        if (flag == true) {
            deal_Statement(statement1, symList, mry);
        }
    } else {
        cout << "if Syntax error" << endl;
    }
}

void deal_while(vector<string> text, ST *symList, int *mry) {
    if (text.size() < 3) return;
    string cmd1 = text[0];
    string exp = cmd1.substr(6, cmd1.length() - 7); // strip "while(" and ")"
    string statement = text[1];
    string cmd3 = text[2];
    // A loop that does not end with endWhile is silently ignored
    if (cmd3 == "endWhile") {
        // Evaluate the condition once per iteration
        while (judgeExp(exp, symList, mry)) {
            deal_Statement(statement, symList, mry);
        }
    }
}

// Function call preconditions
// Returns the index of the first occurrence of c in str, or -1
int find_func(string str, char c) {
    for (int i = 0; i < str.length(); i++) {
        if (str[i] == c) {
            return i;
        }
    }
    return -1;
}

void call_func(vector<string> line_call);

//******************** Batch processing: executes one line of a function body ********************
void deal_Line(string line, ST *symList, int *mry) {
    vector<string> text = split(line, " ");
    if (text.size() == 1 && text[0] == "symtab") {
        symctl(symList);
    } else if (text.size() == 1 && text[0] == "memory") {
        memory(mry);
    } else if (text.size() == 1 && text[0] == "tmptab") {
        showTmp();
    } else if (text.size() == 2 && text[0] == "&") {
        cout << fetchAddrr(text[1], symList) << endl << endl;
    } else if (text[0].substr(0, 5) == "size(" && text[0][text[0].length() - 1] == ')') {
        // Take everything between "size(" and ")", not just one character
        string varName = text[0].substr(5, text[0].length() - 6);
        cout << fetchLen(varName, symList) << endl << endl;
    } else if (text.size() == 2 && text[1] == "?") {
        fetch(text[0], symList, mry);
    } else if (text.size() == 2 && (text[0] == "string" || text[0] == "int")) {
        func_declear(text[0], text[1], symList);
    } else if (text.size() == 1) {
        deal_Statement(text[0], symList, mry);
    } else if (text.size() == 3 && text[1] == "=") {
        if (find(text[2].begin(), text[2].end(), '(') != text[2].end()) {
            // Assigning the result of a function call is not implemented
        } else {
            func_assignment(text[0], text[2], symList, mry);
        }
    } else if (text[0].substr(0, 2) == "if") {
        deal_if(text, symList, mry);
    } else if (text[0].substr(0, 5) == "while") {
        deal_while(text, symList, mry);
    } else if (text.size() == 1 && split(text[0], "(").size() == 2) {
        // Note: unreachable as written, because the text.size() == 1 branch above matches first
        call_func(text);
    } else if (text.size() == 2 && text[0] == "return") {
        string ans = cul_Statement(text[1], symList, mry);
        if (ans != "ERROR!") {
            string ansType;
            if (atoi(ans.c_str()) == 0 && ans != "0") ansType = "string";
            else ansType = "int";
            if (ansType != fat.returnType)
                cout << "Return value type does not match function declaration!" << endl;
            else {
                fat.returnVar = ans;
            }
        }
    }
}

// Executes every stored line of a function body
void deal_func(vector<string> text, ST *symList, int *mry) {
    for (int j = 0; j < text.size(); j++) {
        string s = text[j];
        deal_Line(s, symList, mry);
        cout << endl;
    }
}

vector<string> text; // body of the most recently defined function

void show_func() {
    for (int k = 0; k < text.size(); k++) {
        cout << text[k] << endl;
    }
}

void showReturn() {
}

// Clears the function's symbol table and memory before a new call
void eraseFuncVar() {
    for (int i = 0; i < SIZE_OF_SYM_FUNC; i++) {
        if (fat.func_ST[i].varName != "") {
            fat.func_ST[i].varName = "";
            fat.func_ST[i].varLength = 0;
            fat.func_ST[i].addrr = 0;
            fat.func_ST[i].varType = "";
        }
    }
    for (int j = 0; j < SIZE_OF_MRY_FUNC; j++) {
        if (fat.func_mry[j] != 0) {
            fat.func_mry[j] = 0;
        }
    }
}

void call_func(vector<string> line_call) {
    int pos = find_func(line_call[0], '(');
    string funcName = line_call[0].substr(0, pos);
    if (findVar(funcName, symList) > -1) {
        eraseFuncVar();
        // Extract the argument list between the parentheses
        string elmAll = line_call[0].substr(pos + 1);
        elmAll = elmAll.substr(0, elmAll.length() - 1);
        vector<string> elm = split(elmAll, ",");
        string elm1 = test(elm[0], symList, mry);
        string elm2 = test(elm[1], symList, mry);
        // "0" is an int as well, even though atoi returns 0 for it
        string elm1Type = (atoi(elm1.c_str()) != 0 || elm1 == "0") ? "int" : "string";
        string elm2Type = (atoi(elm2.c_str()) != 0 || elm2 == "0") ? "int" : "string";
        string funcType_ST = symList[findVar(funcName, symList)].varType;
        vector<string> typeArr = split(funcType_ST, ",");
        string returnType = typeArr[2].substr(0, typeArr[2].length() - 1); // drop the trailing ')'
        string funcType_real = "func(" + elm1Type + "," + elm2Type + "," + returnType + ")";
        if (funcType_ST != funcType_real)
            cout << "Parameter type mismatch!" << endl;
        else {
            // Bind the arguments as local variables in the function's own tables
            func_declear(elm1Type, fat.funcVar1, fat.func_ST);
            func_assignment(fat.funcVar1, elm1, fat.func_ST, fat.func_mry);
            func_declear(elm2Type, fat.funcVar2, fat.func_ST);
            func_assignment(fat.funcVar2, elm2, fat.func_ST, fat.func_mry);
            func_lock = true;
            deal_func(text, fat.func_ST, fat.func_mry);
            func_lock = false;
            showReturn();
        }
    } else {
        cout << "The function is not declared!" << endl;
    }
}

void writeFunc() {
    string funcType[3]; // Used to store function types
    string cmd2;
    cin >> cmd2;
    fat.funcName = cmd2;
    int keyInST = -1;
    // symList is the global table, so it is scanned up to SIZE_OF_SYM
    for (int i = 0; i < SIZE_OF_SYM; i++) {
        if (symList[i].varName == "") {
            symList[i].varName = cmd2;
            symList[i].addrr = 10001; // Placeholder address standing in for the stored function body
            keyInST = i;
            break;
        }
    }
    if (keyInST == -1) {
        cout << "Symbol table is full!" << endl;
        return;
    }
    string cmd3;
    cin >> cmd3;
    if (cmd3 == "returnType") {
        string cmd4;
        cin >> cmd4;
        if (cmd4 == "int" || cmd4 == "string") {
            fat.returnType = cmd4;
            funcType[0] = cmd4;
            string var1Type;
            cin >> var1Type;
            funcType[1] = var1Type;
            string var1Name;
            cin >> var1Name;
            fat.funcVar1 = var1Name;
            string var2Type;
            cin >> var2Type;
            funcType[2] = var2Type;
            string var2Name;
            cin >> var2Name;
            fat.funcVar2 = var2Name;
            symList[keyInST].varType = "func(" + funcType[1] + "," + funcType[2] + "," + funcType[0] + ")";
            cout << "#System notice --- The type of this function:" << symList[keyInST].varType << endl;
            text = writeVector(text);
            // Drop the remainder of the header line left in the input buffer
            if (!text.empty()) text.erase(text.begin());
        } else {
            cout << "Return value type must be int or string!" << endl;
        }
    } else {
        cout << "returnType Syntax error!" << endl;
    }
}

// Removes the previously defined function from the global symbol table
void eraseFuncInST() {
    for (int i = 0; i < SIZE_OF_SYM; i++) {
        if (split(symList[i].varType, ",").size() == 3 && symList[i].varLength == 0) {
            symList[i].varName = "";
            break;
        }
    }
}

// Returns the symbol table index of the function named in str, or -1
int whereFunc(string str) {
    int pos = find_func(str, '(');
    string funcName = str.substr(0, pos);
    int posInST = findVar(funcName, symList);
    if (posInST > -1) {
        return posInST;
    } else {
        return -1;
    }
}

void controller() {
    while (1) {
        string cmd1;
        if (!(cin >> cmd1)) break; // stop at end of input
        if (cmd1 == "define") {
            eraseFuncInST();
            writeFunc();
        }
        else if (cmd1 == "exit") break;
        else if (cmd1 == "int" || cmd1 == "string") {
            declear(cmd1, symList);
        }
        else if (cmd1 == "symtab") {
            symctl(symList);
        }
        else if (cmd1 == "fats") {
            symctl(fat.func_ST);
        }
        else if (cmd1 == "memory") {
            memory(mry);
        }
        else if (cmd1 == "fatm") {
            fat_memory(fat.func_mry);
        }
        else if (cmd1 == "tmptab") {
            showTmp();
        }
        else if ((findVar(cmd1, symList)) > -1) {
            int key = findVar(cmd1, symList);
            string cmd2;
            cin >> cmd2;
            if (cmd2 == "=") {
                assignment(key, symList, mry);
            } else if (cmd2 == "?") {
                if (symList[key].varType == "int") {
                    int res = fetchInt(key, symList, mry);
                    cout << res << endl << endl;
                } else if (symList[key].varType == "string") {
                    string str = fetchString(key, symList, mry);
                    cout << str << endl << endl;
                }
            } else if (cmd2 == "+" || cmd2 == "-" || cmd2 == "*" || cmd2 == "/" || cmd2 == "|" || cmd2 == "^") {
                string cmd3;
                cin >> cmd3;
                string statement = cmd1 + cmd2 + cmd3;
                deal_Statement(statement, symList, mry);
            }
        }
        else if (cmd1[0] == '&') {
            string varName = cmd1.substr(1);
            cout << fetchAddrr(varName, symList) << endl << endl;
        }
        else if (cmd1.substr(0, 5) == "size(" && cmd1[cmd1.length() - 1] == ')') {
            cout << "Take Length" << endl;
            int end = cmd1.length() - 6;
            string varName = cmd1.substr(5, end);
            cout << fetchLen(varName, symList) << endl << endl;
        }
        else if (cmd1.substr(0, 3) == "if(" && cmd1[cmd1.length() - 1] == ')') {
            string exp = cmd1.substr(3, cmd1.length() - 4); // strip "if(" and ")"
            string cmd2;
            cin >> cmd2;
            string statement1 = cmd2;
            string cmd3;
            cin >> cmd3;
            bool flag;
            if (cmd3 == "endif") {
                flag = judgeExp(exp, symList, mry);
                if (flag == true) {
                    deal_Statement(statement1, symList, mry);
                }
            } else if (cmd3 == "else") {
                string cmd4;
                cin >> cmd4;
                string cmd5;
                cin >> cmd5;
                if (cmd5 != "endif") {
                    cout << "if syntax error: the statement must end with endif!" << endl << endl;
                    continue;
                } else {
                    string statement2 = cmd4;
                    flag = judgeExp(exp, symList, mry);
                    if (flag == true) {
                        deal_Statement(statement1, symList, mry);
                    } else {
                        deal_Statement(statement2, symList, mry);
                    }
                }
            } else {
                cout << "if statement syntax error!" << endl;
            }
        }
        else if (cmd1.substr(0, 5) == "while") {
            string cmd2;
            cin >> cmd2;
            string cmd3;
            cin >> cmd3;
            vector<string> testWhile;
            testWhile.push_back(cmd1);
            testWhile.push_back(cmd2);
            testWhile.push_back(cmd3);
            deal_while(testWhile, symList, mry);
        }
        else if (split(cmd1, "(").size() == 2 && split(cmd1, ",").size() == 2 && whereFunc(cmd1) != -1) {
            // Function call: the return value is placed in the tmp register
            func_lock = true;
            vector<string> testify;
            testify.push_back(cmd1);
            call_func(testify);
            tmp.varValue = fat.returnVar;
            tmp.varType = fat.returnType;
            func_lock = false;
        }
        else if (cmd1 != "") {
            string cmd2;
            cin >> cmd2;
            if (cmd2 == "+" || cmd2 == "-" || cmd2 == "*" || cmd2 == "/" || cmd2 == "|" || cmd2 == "^") {
                string cmd3;
                cin >> cmd3;
                string statement = cmd1 + cmd2 + cmd3;
                deal_Statement(statement, symList, mry);
            } else if (cmd2 == "=") {
                string cmd3;
                cin >> cmd3;
                cout << "Variable " << cmd1 << " is not declared!" << endl;
                continue;
            }
        }
        else {
            cout << "syntax error" << endl;
        }
    }
}

int main() {
    controller();
    cout << "Bye-bye!" << endl;
    return 0;
}
4. Gains and insights from the experiment
(1) Over two or three days I gained a better understanding of the basic operations of the CPU and the VM
Before this experiment I did not really understand which basic operations the CPU performs while a function executes. After first studying the topic, my knowledge amounted to memorized concepts. After this experiment, I understand how the basic CPU operations combine, and why that combination is necessary and reasonable. I also have a deeper understanding of how global and local variables are stored and indexed in the user process address space.
(2) Fully realizing the importance of function reuse
Following the working principle of the computer, writing the basic operations (declaration, assignment, value fetching, addressing, calculation, and so on) as "atomic operations" greatly reduces code redundancy.
The global symbol table (ST) and the activation-record ST are passed to these functions as pointer parameters. The same routines therefore work on global variables and on variables inside a function, and executing functions of the custom language does not require a second, nearly identical copy of the code written for global variables.
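The pointer-parameter idea can be sketched as follows. This is a minimal illustration, not the report's actual code: the `Symbol` record and the names `findSymbol` and `declareSymbol` are hypothetical simplifications of the `ST` structure and its helpers.

```cpp
#include <string>

// Hypothetical, simplified symbol record; the report's ST has more fields.
struct Symbol {
    std::string name;  // an empty name marks a free slot
    int addr = 0;
};

// Returns the index of `name` in `table`, or -1 if it is absent.
// The table arrives as a pointer, so any table can be searched.
int findSymbol(const std::string& name, const Symbol* table, int size) {
    for (int i = 0; i < size; ++i) {
        if (table[i].name == name) return i;
    }
    return -1;
}

// Stores `name` in the first free slot of `table`.
// Returns the slot index, or -1 if the name is empty or the table is full.
int declareSymbol(const std::string& name, Symbol* table, int size) {
    if (name.empty()) return -1;
    for (int i = 0; i < size; ++i) {
        if (table[i].name.empty()) {
            table[i].name = name;
            table[i].addr = i;
            return i;
        }
    }
    return -1;
}
```

Calling `declareSymbol("a", globals, 4)` and `declareSymbol("a", locals, 2)` runs the same code against two different tables, which is the reuse described above.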
(3) A deeper understanding of syntax
A small problem appeared while rewriting the program: how do I determine whether a "string" is a variable name or a temporary (literal) string? Because the initial grammar design did not take this into account and placed no restriction on the format of strings, a literal string in the interpreter cannot be spelled the same as a declared variable name. This shows the importance of delimiting string literals with double quotation marks (""); the same reasoning extends to chars, functions, floats, and so on.
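To illustrate the point about quotation marks, here is a small sketch of how a token could be classified if string literals were required to be quoted. The report's interpreter does not do this; `TokenKind`, `classify`, and `literalValue` are hypothetical names introduced only for the example.

```cpp
#include <string>

enum class TokenKind { StringLiteral, Identifier };

// A token counts as a string literal only if it is wrapped in double quotes;
// everything else is treated as a name to look up in the symbol table.
TokenKind classify(const std::string& tok) {
    bool quoted = tok.size() >= 2 && tok.front() == '"' && tok.back() == '"';
    return quoted ? TokenKind::StringLiteral : TokenKind::Identifier;
}

// Returns the text between the quotes. Call only on tokens that
// classify() reported as StringLiteral.
std::string literalValue(const std::string& tok) {
    return tok.substr(1, tok.size() - 2);
}
```

With this rule, `abc` is always a variable name and `"abc"` is always a literal, so the two can never be confused.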
(4) The benefits of programming ideas:
The interpreter's function feature requires looking up both the global and the local variable tables, and I did not want to write redundant code for function calls. Borrowing the "lock" concept from operating systems, I set a bool variable and encapsulated the processing of a function call (several functions) into one function. The lock is set before the function call executes; while it is set, the basic operations check both the local and the global tables; it is released after the call finishes, so that basic operations in the global scope look only at the global tables.
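The lock idea can be sketched as a lookup routine guarded by a flag. This is only an illustration under simplifying assumptions: it uses `std::map` instead of the report's array-based tables, and `Scopes`, `funcLock`, and `lookup` are hypothetical names.

```cpp
#include <map>
#include <string>

struct Scopes {
    std::map<std::string, int> globals;
    std::map<std::string, int> locals;
    bool funcLock = false;  // true only while a function body is executing
};

// Looks up `name`. While the lock is set, local variables take precedence
// and globals are the fallback; otherwise only globals are consulted.
// On success the value is written to `out` and true is returned.
bool lookup(const Scopes& s, const std::string& name, int& out) {
    if (s.funcLock) {
        auto it = s.locals.find(name);
        if (it != s.locals.end()) {
            out = it->second;
            return true;
        }
    }
    auto it = s.globals.find(name);
    if (it != s.globals.end()) {
        out = it->second;
        return true;
    }
    return false;
}
```

Setting `funcLock` before running a function body and clearing it afterwards mirrors how `func_lock` is toggled around the call in the listing.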
Storing code snippets and calling functions was a real headache when I first read the assignment. The first question was how to run several instructions as one batch. After some thought, I decided to store the instructions, in their syntactic format, in a "container" and to execute them in a loop when the function is called (the loop parses the sequence of instructions in the container).
As for the choice of container, one option is to save the instructions in a file, but disk I/O would reduce efficiency. Since the interpreter is small and is unlikely to occupy much memory, I decided to use the C++ vector container to store the code snippets.
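A minimal sketch of the container approach: the body lines are read into a `std::vector<std::string>` until a terminator line, then replayed one by one when the function is called. The names `readBody` and `runBody` are hypothetical, and the callback stands in for the interpreter's per-line executor.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads lines from `in` until `terminator` (or end of input) is reached.
// The terminator line itself is not stored.
std::vector<std::string> readBody(std::istream& in, const std::string& terminator) {
    std::vector<std::string> body;
    std::string line;
    while (std::getline(in, line) && line != terminator) {
        body.push_back(line);
    }
    return body;
}

// Replays the stored body by handing each line, in order, to `exec`.
template <typename Exec>
void runBody(const std::vector<std::string>& body, Exec exec) {
    for (const std::string& line : body) {
        exec(line);
    }
}
```

Because the body stays in memory, calling the function again only needs another `runBody` pass; no file has to be reopened.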
(5) A better understanding of address space management and pointers
Regarding how data is stored in memory: if the interpreter's virtual memory (that is, the array) held the real address of each string, the array subscripts could no longer reflect the length of the string, because what is stored would be a pointer to the string rather than the string itself.
After some thought, I decided to store each char of a string as an int in memory and to manage memory with a FIFO scheme, which restores the computer's storage structure more faithfully. Another advantage of this structure is that a "pointer" type could be defined for the custom language. I thought through how it would be implemented, which further confirmed my understanding of pointers. However, since the value of this experiment for me lies mainly in the basic operations, I did not spend more time implementing pointers.
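The storage scheme can be sketched as follows, assuming that memory is an int array filled strictly in order of allocation and that the symbol table records a start address and a length for each string. `StringSlot`, `storeString`, and `loadString` are hypothetical names, and a `std::vector<int>` stands in for the fixed-size array.

```cpp
#include <string>
#include <vector>

// What the symbol table would record for a string variable.
struct StringSlot {
    int addr;    // index of the first character in memory
    int length;  // number of characters stored
};

// Appends each char of `s` to memory as an int, one cell per character.
StringSlot storeString(std::vector<int>& memory, const std::string& s) {
    StringSlot slot{static_cast<int>(memory.size()), static_cast<int>(s.size())};
    for (unsigned char c : s) {
        memory.push_back(c);
    }
    return slot;
}

// Rebuilds the string from the `length` cells starting at `addr`.
std::string loadString(const std::vector<int>& memory, StringSlot slot) {
    std::string out;
    for (int i = 0; i < slot.length; ++i) {
        out.push_back(static_cast<char>(memory[slot.addr + i]));
    }
    return out;
}
```

Here the address and length are plain array subscripts, so a "pointer" in the custom language could simply be an int holding `addr`.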
(6) Disadvantages:
Regarding syntax, I realized later that regular expressions would be a better way to parse it, but I am not yet proficient enough with them. With regular expressions, many of the "unreasonable" restrictions on the syntax could be relaxed.
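As an example of what regular expressions could relax, the sketch below parses an assignment with `std::regex` and tolerates any amount of whitespace around `=`, whereas the listing expects tokens separated by single spaces. The `Assignment` struct, the `parseAssignment` name, and the exact pattern are assumptions made for illustration.

```cpp
#include <regex>
#include <string>

struct Assignment {
    std::string name;  // variable on the left of '='
    std::string expr;  // expression on the right, with outer spaces trimmed
};

// Matches "name = expr" with optional whitespace anywhere around the parts.
// Returns false if the line is not an assignment of that shape.
bool parseAssignment(const std::string& line, Assignment& out) {
    static const std::regex pattern(R"(^\s*([A-Za-z_]\w*)\s*=\s*(.+?)\s*$)");
    std::smatch m;
    if (!std::regex_match(line, m, pattern)) return false;
    out.name = m[1].str();
    out.expr = m[2].str();
    return true;
}
```

Both `a=b+1` and `  total   =  x * 2  ` are accepted by the same pattern, while a line such as `1x = 3` is rejected because the name may not start with a digit.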
5. Copyright Notes
- The software copyright of this program is held jointly by the Network Engineering Teaching and Research Department of Beijing Forestry University and the author.
- Forwarding and sharing for learning purposes are welcome, but any commercial use without my permission is prohibited!